Why can't the Celery daemon see tasks?

I have a Django 1.6.2 application running on Debian 7.8 with Nginx 1.2.1 as my proxy server and Gunicorn 19.1.1 as my application server. I've installed Celery 3.1.7 and RabbitMQ 2.8.4 to handle asynchronous tasks. I'm able to start a Celery worker as a daemon, but whenever I try to run the test "add" task as shown in the Celery docs, I get the following error:
Received unregistered task of type u'apps.photos.tasks.add'.
The message has been ignored and discarded.
Traceback (most recent call last):
File "/home/swing/venv/swing/local/lib/python2.7/site-packages/celery/worker/consumer.py", line 455, in on_task_received
strategies[name](message, body,
KeyError: u'apps.photos.tasks.add'
All of my configuration files are kept in a "conf" directory that sits just below my "myproj" project directory. The "add" task is in apps/photos/tasks.py.
myproj
├── apps
│   └── photos
│       ├── __init__.py
│       └── tasks.py
└── conf
    ├── celeryconfig.py
    ├── celeryconfig.pyc
    ├── celery.py
    ├── __init__.py
    ├── middleware.py
    ├── settings
    │   ├── base.py
    │   ├── dev.py
    │   ├── __init__.py
    │   └── prod.py
    ├── urls.py
    └── wsgi.py
Here is the tasks file:
# apps/photos/tasks.py
from __future__ import absolute_import
from conf.celery import app

@app.task
def add(x, y):
    return x + y
Here are my Celery application and configuration files:
# conf/celery.py
from __future__ import absolute_import
import os

from celery import Celery
from django.conf import settings

from conf import celeryconfig

os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'conf.settings')

app = Celery('conf')
app.config_from_object(celeryconfig)
app.autodiscover_tasks(lambda: settings.INSTALLED_APPS)

@app.task(bind=True)
def debug_task(self):
    print('Request: {0!r}'.format(self.request))
# conf/celeryconfig.py
BROKER_URL = 'amqp://guest@localhost:5672//'
CELERY_RESULT_BACKEND = 'amqp'
CELERY_ACCEPT_CONTENT = ['json', ]
CELERY_TASK_SERIALIZER = 'json'
CELERY_RESULT_SERIALIZER = 'json'
This is my Celery daemon config file. I commented out CELERY_APP because I've found that the Celery daemon won't even start if I uncomment it. I also found that I need to add the "--config" argument to CELERYD_OPTS in order for the daemon to start. I created a non-privileged "celery" user who can write to the log and pid files.
# /etc/default/celeryd
CELERYD_NODES="worker1"
CELERYD_LOG_LEVEL="DEBUG"
CELERY_BIN="/home/myproj/venv/myproj/bin/celery"
#CELERY_APP="conf"
CELERYD_CHDIR="/www/myproj/"
CELERYD_OPTS="--time-limit=300 --concurrency=8 --config=celeryconfig"
CELERYD_LOG_FILE="/var/log/celery/%N.log"
CELERYD_PID_FILE="/var/run/celery/%N.pid"
CELERYD_USER="celery"
CELERYD_GROUP="celery"
CELERY_CREATE_DIRS=1
I can see from the log file that when I run the command, "sudo service celeryd start", Celery starts without any errors. However, if I open the Python shell and run the following commands, I'll see the error I described at the beginning.
$ python manage.py shell
In [1]: from apps.photos.tasks import add
In [2]: result = add.delay(2, 2)
What's interesting is that if I examine Celery's registered tasks object, the task is listed:
In [3]: import celery
In [4]: celery.registry.tasks
Out[4]: {'celery.chain': ..., 'apps.photos.tasks.add': <@task: apps.photos.tasks.add of conf:0x16454d0> ...}
Other similar questions here have discussed having a PYTHONPATH environment variable and I don't have such a variable. I've never understood how to set PYTHONPATH and this project has been running just fine for over a year without it.
I should also add that my production settings file is conf/settings/prod.py. It imports all of my base (tier-independent) settings from base.py and adds some extra production-dependent settings.
Can anyone tell me what I'm doing wrong? I've been struggling with this problem for three days now.
Thanks!

Looks like this is happening due to a relative import naming issue.
>>> from project.myapp.tasks import mytask
>>> mytask.name
'project.myapp.tasks.mytask'
>>> from myapp.tasks import mytask
>>> mytask.name
'myapp.tasks.mytask'
If you’re using relative imports you should set the name explicitly.
@task(name='proj.tasks.add')
def add(x, y):
    return x + y
Check out: http://celery.readthedocs.org/en/latest/userguide/tasks.html#automatic-naming-and-relative-imports
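Applied to the task from the question, a minimal sketch might look like this (assuming the conf.celery app shown above; the only change is pinning the fully qualified name):
# apps/photos/tasks.py
from __future__ import absolute_import
from conf.celery import app

# name= pins the task name so it matches what the worker looks up,
# regardless of how the module happens to be imported
@app.task(name='apps.photos.tasks.add')
def add(x, y):
    return x + y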

I'm using Celery 4.0.2 with Django, and I created a celery user and group for use with celeryd and had this same problem. The command-line version worked fine, but celeryd was not registering the tasks. It was NOT a relative-naming problem.
The solution was to add the celery user to the group that can access the Django project. In my case, this group is www-data, with read and execute permissions but no write.
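For reference, a minimal sketch of that change (assuming, as above, that the project files are group-owned by www-data and the worker runs via the celeryd init script):
# add the celery user to the group that can read the Django project files
sudo usermod -a -G www-data celery
# restart the daemon so the new group membership takes effect
sudo service celeryd restart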

Related

Django import messes up Celery

I have a Django project with the following structure:
project/
├── ...
├── some_app/
│   ├── __init__.py
│   └── some_module_where_i_import_some_utils.py
├── server/
│   ├── __init__.py
│   ├── settings/
│   │   ├── __init__.py
│   │   ├── common.py
│   │   ├── dev.py
│   │   └── ...
│   ├── celery.py
│   └── ...
├── utils/
│   ├── __init__.py
│   └── some_utils.py
├── manage.py
└── ...
When using utils I import them the following way:
from project.utils.some_utils import whatever
And it works well. However, when I run the Celery worker using DJANGO_SETTINGS_MODULE=server.settings.dev celery -A server worker --beat -l info, autodiscover_tasks fails with the following error: ModuleNotFoundError: No module named 'project'.
Here are the contents of server/celery.py:
import os

from celery import Celery

os.environ.setdefault("DJANGO_SETTINGS_MODULE", "server.settings.prod")

app = Celery("server")
app.config_from_object("django.conf:settings", namespace="CELERY")

# Load task modules from all registered Django app configs.
app.autodiscover_tasks()

@app.task(bind=True)
def debug_task(self):
    print("Request: {0!r}".format(self.request))
Here is server/__init__.py:
from .celery import app as celery_app
__all__ = ("celery_app",)
Modifying celery.py the following way did the job:
import os
import sys
from celery import Celery
sys.path.append("..")
...
I'm not sure if this could cause problems in the future; I will keep looking into it and update the answer if I come up with something better.
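Pieced together from the snippets above, the modified server/celery.py would look roughly like this (a sketch; the only addition relative to the original file is the sys.path line):
# server/celery.py
import os
import sys

from celery import Celery

# Make the parent directory importable so that "project.utils..." imports
# inside task modules resolve when autodiscover_tasks() runs.
sys.path.append("..")

os.environ.setdefault("DJANGO_SETTINGS_MODULE", "server.settings.prod")

app = Celery("server")
app.config_from_object("django.conf:settings", namespace="CELERY")

# Load task modules from all registered Django app configs.
app.autodiscover_tasks()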

consumer: Cannot connect to amqp://guest:**@127.0.0.1:5672//: [Errno 61] Connection refused

This is my project structure
myproj
├── app1
│   ├── __init__.py
│   └── tasks.py
├── gettingstarted
│   ├── __init__.py
│   ├── urls.py
│   └── settings.py
├── manage.py
└── Procfile
In gettingstarted/settings.py:
BROKER_URL = 'redis://'
In Procfile:
web: gunicorn gettingstarted.wsgi --log-file -
worker: celery worker --app=app1.tasks.app
In app1/tasks.py:
from __future__ import absolute_import, unicode_literals
import random
import celery
import os

app = celery.Celery('hello')

@app.task
def add(x, y):
    return x + y
When I run "celery worker" it gives me:
consumer: Cannot connect to amqp://guest:**@127.0.0.1:5672//: [Errno 61] Connection refused.
You're not configuring celery from your django settings. To integrate celery with django, it's best to just follow the guide:
from __future__ import absolute_import, unicode_literals
import random
import celery
import os

os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'gettingstarted.settings')

app = celery.Celery('hello')

# Using a string here means the worker doesn't have to serialize
# the configuration object to child processes.
# - namespace='CELERY' means all celery-related configuration keys
#   should have a `CELERY_` prefix.
app.config_from_object('django.conf:settings', namespace='CELERY')

# Load task modules from all registered Django app configs.
app.autodiscover_tasks()

@app.task
def add(x, y):
    return x + y
And in settings.py change BROKER_URL to CELERY_BROKER_URL.
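For example, in gettingstarted/settings.py this could look like the following (a sketch; the Redis host, port, and database number are assumptions, adjust to your environment):
# read by Celery because of namespace='CELERY' in config_from_object
CELERY_BROKER_URL = 'redis://localhost:6379/0'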

Running supervisord from the host, celery from a virtualenv (Django app)

I'm trying to use celery and a redis queue to perform a task for my Django app. Supervisord is installed on the host via apt-get, whereas celery resides in a specific virtualenv on my system, installed via pip.
As a result, I can't seem to get the celery command to run via supervisord. If I run it from inside the virtualenv, it works fine; outside of it, it doesn't. How do I get it to run under my current setup? Is the solution simply to install celery via apt-get instead of inside the virtualenv? Please advise.
My celery.conf inside /etc/supervisor/conf.d is:
[program:celery]
command=/home/mhb11/.virtualenvs/myenv/local/lib/python2.7/site-packages/celery/bin/celery -A /etc/supervisor/conf.d/celery.conf -l info
directory = /home/mhb11/somefolder/myproject
environment=PATH="/home/mhb11/.virtualenvs/myenv/bin",VIRTUAL_ENV="/home/mhb11/.virtualenvs/myenv",PYTHONPATH="/home/mhb11/.virtualenvs/myenv/lib/python2.7:/home/mhb11/.virtualenvs/myenv/lib/python2.7/site-packages"
user=mhb11
numprocs=1
stdout_logfile = /etc/supervisor/logs/celery-worker.log
stderr_logfile = /etc/supervisor/logs/celery-worker.log
autostart = true
autorestart = true
startsecs=10
stopwaitsecs = 600
killasgroup = true
priority = 998
And the folder structure for my Django project is:
/home/mhb11/somefolder/myproject
├── myproject
│   ├── celery.py       # The Celery app file
│   ├── __init__.py     # The project module file (modified)
│   ├── settings.py     # Including Celery settings
│   ├── urls.py
│   └── wsgi.py
├── manage.py
├── celerybeat-schedule
└── myapp
    ├── __init__.py
    ├── models.py
    ├── tasks.py        # File containing tasks for this app
    ├── tests.py
    └── views.py
If I do a status check via supervisorctl, I get a FATAL error on the command I'm trying to run in celery.conf. Help!
p.s. note that user mhb11 does not have root privileges, in case it matters. Moreover, /etc/supervisor/logs/celery-worker.log is empty, and inside supervisord.log the relevant error I see is: INFO spawnerr: can't find command '/home/mhb11/.virtualenvs/redditpk/local/lib/python2.7/site-packages/celery/bin/celery'.
The path to the celery binary is myenv/bin/celery, whereas you are using myenv/local/lib/python2.7/site-packages/celery/bin/celery.
So if you try on your terminal the command you are passing to supervisor (command=xxx), you should get the same error.
You need to replace your command=xxx in your celery.conf with
command=/home/mhb11/.virtualenvs/myenv/bin/celery -A myproject.celery -l info
Note that I have also replaced the -A parameter with the Celery app path (myproject.celery) instead of the supervisor configuration file. This app path is resolved relative to the project directory set in celery.conf with
directory = /home/mhb11/somefolder/myproject
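Putting the two changes together, the corrected celery.conf might look roughly like this (a sketch based on the original file above; the worker subcommand is my assumption about what the command is meant to run, everything else is unchanged):
[program:celery]
; run the worker through the virtualenv's celery binary, with -A pointing at the Celery app
command=/home/mhb11/.virtualenvs/myenv/bin/celery worker -A myproject.celery -l info
directory=/home/mhb11/somefolder/myproject
environment=PATH="/home/mhb11/.virtualenvs/myenv/bin",VIRTUAL_ENV="/home/mhb11/.virtualenvs/myenv"
user=mhb11
numprocs=1
stdout_logfile=/etc/supervisor/logs/celery-worker.log
stderr_logfile=/etc/supervisor/logs/celery-worker.log
autostart=true
autorestart=true
killasgroup=true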
On a side note, if you are using Celery with Django via the django-celery package, you can manage Celery through Django's manage.py; there is no need to invoke celery directly:
python manage.py celery worker
python manage.py celery beat
For details, please read the introduction to Django Celery.

Configure app to deploy to Heroku

I posted a question earlier today (Heroku deploy problem).
I've had a lot of good suggestions, but could not get my app to deploy on Heroku.
I have stripped the app to 15 lines of code. The app still refuses to deploy.
This is the error:
ImportError: No module named 'main'
File "/app/.heroku/python/bin/gunicorn", line 11, in <module>
sys.exit(run())
WSGIApplication("%(prog)s [OPTIONS] [APP_MODULE]").run()
This is my app's directory:
This is the content of the Procfile:
web: gunicorn main:app --log-file=-
This is the content of the main.py file:
import os
from flask import Flask

app = Flask(__name__, instance_relative_config=True)
app.config.from_object('config')
app.config.from_pyfile('config.py')

@app.route('/')
def hello():
    return 'Hello World!'

if __name__ == '__main__':
    # REMEMBER: Never have this set to True on Production
    # manager.run()
    app.run()
I have followed all the tutorials, read up on modules and packages, seen suggestions on this site, read Explore Flask, and read the official Flask documentation. They ALL have some sort of variation on establishing an app, and it's very difficult to understand what the right way is or where files are supposed to be.
There are several problems in your example code:
You need a package.
The No module named 'main' error comes from the Procfile line web: gunicorn main:app --log-file=-. The right way is to add an __init__.py beside main.py so Python knows it is a package. Edit your Procfile to this:
web: gunicorn blackduckflock.main:app --log-file=-
The instance folder.
Since you specify instance_relative_config=True, I think the proper way to organize your project is like this:
blackduckflock
├── blackduckflock
│   ├── __init__.py
│   └── main.py
├── config.py
├── instance
│   └── config.py
└── Procfile
And you can run gunicorn blackduckflock.main:app to see if it works.

Hosting Django app with Waitress

I'm trying to host a Django app on my Ubuntu VPS. I've got python, django, and waitress installed and the directories moved over.
I went to the Waitress site ( http://docs.pylonsproject.org/projects/waitress/en/latest/ ) and they said to use it like this:
from waitress import serve
serve(wsgiapp, host='5.5.5.5', port=8080)
Do I put my app name in place of 'wsgiapp'? Do I need to run this in the top-level Django project directory?
Tested with Django 1.9 and Waitress 0.9.0
You can use Waitress with your Django application by creating a script (e.g., server.py) in your Django project root and importing the application variable from the wsgi.py module:
yourdjangoproject project root structure
├── manage.py
├── server.py
├── yourdjangoproject
│   ├── __init__.py
│   ├── settings.py
│   ├── urls.py
│   ├── wsgi.py
wsgi.py (Updated January 2021 w/ static serving)
This is the default django code for wsgi.py:
import os
from django.core.wsgi import get_wsgi_application
os.environ.setdefault("DJANGO_SETTINGS_MODULE", "yourdjangoproject.settings")
application = get_wsgi_application()
If you need static file serving, you can edit wsgi.py to use something like WhiteNoise or dj-static for static assets:
import os
from django.core.wsgi import get_wsgi_application
os.environ.setdefault("DJANGO_SETTINGS_MODULE", "yourdjangoproject.settings")
"""
YOU ONLY NEED ONE OF THESE.
Choose middleware to serve static files.
WhiteNoise seems to be the go-to but I've used dj-static
successfully in many production applications.
"""
# If using WhiteNoise:
from whitenoise import WhiteNoise
application = WhiteNoise(get_wsgi_application())
# If using dj-static:
from dj_static import Cling
application = Cling(get_wsgi_application())
server.py
from waitress import serve
from yourdjangoproject.wsgi import application
if __name__ == '__main__':
    serve(application, port='8000')
Usage
Now you can run $ python server.py
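As a usage note, serve() also accepts a host argument if you want to control which interface it binds to; a small sketch (the address and port here are only examples):
from waitress import serve
from yourdjangoproject.wsgi import application

# bind only to localhost, e.g. when a reverse proxy such as Nginx sits in front
serve(application, host='127.0.0.1', port=8000)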
I managed to get it working by using a bash script instead of a python call. I made a script called 'startserver.sh' containing the following (replace yourprojectname with your project name obviously):
#!/bin/bash
waitress-serve --port=80 yourprojectname.wsgi:application
I put it in the top-level Django project directory.
Changed the permissions to execute by owner:
chmod 700 startserver.sh
Then I just execute the script on the server:
sudo ./startserver.sh
And that seemed to work just fine.
