I'm using celery (and django-celery) to allow a user to launch periodic scrapes through the django admin. This is part of a larger project but I've boiled the issue down to a minimal example.
Firstly, celery/celerybeat are running daemonized. Weirdly, if I instead run them with celery -A evofrontend worker -B -l info from my Django project dir then I get no issues.
When I run celery/celerybeat as daemons however then I get a strange import error:
[2016-01-06 03:05:12,292: ERROR/MainProcess] Task evosched.tasks.scrapingTask[e18450ad-4dc3-47a0-b03d-4381a0e65c31] raised unexpected: ImportError('No module named myutils',)
Traceback (most recent call last):
File "/home/lee/Desktop/pyco/evo-scraping-min/venv/local/lib/python2.7/site-packages/celery/app/trace.py", line 240, in trace_task
R = retval = fun(*args, **kwargs)
File "/home/lee/Desktop/pyco/evo-scraping-min/venv/local/lib/python2.7/site-packages/celery/app/trace.py", line 438, in __protected_call__
return self.run(*args, **kwargs)
File "evosched/tasks.py", line 35, in scrapingTask
cs = CrawlerScript('TestSpider', scrapy_settings)
File "evosched/tasks.py", line 13, in __init__
self.crawler = CrawlerProcess(scrapy_settings)
File "/home/lee/Desktop/pyco/evo-scraping-min/venv/local/lib/python2.7/site-packages/scrapy/crawler.py", line 209, in __init__
super(CrawlerProcess, self).__init__(settings)
File "/home/lee/Desktop/pyco/evo-scraping-min/venv/local/lib/python2.7/site-packages/scrapy/crawler.py", line 115, in __init__
self.spider_loader = _get_spider_loader(settings)
File "/home/lee/Desktop/pyco/evo-scraping-min/venv/local/lib/python2.7/site-packages/scrapy/crawler.py", line 296, in _get_spider_loader
return loader_cls.from_settings(settings.frozencopy())
File "/home/lee/Desktop/pyco/evo-scraping-min/venv/local/lib/python2.7/site-packages/scrapy/spiderloader.py", line 30, in from_settings
return cls(settings)
File "/home/lee/Desktop/pyco/evo-scraping-min/venv/local/lib/python2.7/site-packages/scrapy/spiderloader.py", line 21, in __init__
for module in walk_modules(name):
File "/home/lee/Desktop/pyco/evo-scraping-min/venv/local/lib/python2.7/site-packages/scrapy/utils/misc.py", line 71, in walk_modules
submod = import_module(fullpath)
File "/usr/lib/python2.7/importlib/__init__.py", line 37, in import_module
__import__(name)
File "retail/spiders/Retail_spider.py", line 16, in <module>
ImportError: No module named myutils
i.e. the spider is having issues importing from the Django project app, despite the relevant directories being added to sys.path and django.setup() being called.
My hunch is that this may be caused by a "circular import" during initialization, but I'm not sure (see here for notes on the same error).
Celery daemon config
For completeness the celeryd and celerybeat configuration scripts are:
# /etc/default/celeryd
CELERYD_NODES="worker1"
CELERY_BIN="/home/lee/Desktop/pyco/evo-scraping-min/venv/bin/celery"
CELERY_APP="evofrontend"
DJANGO_SETTINGS_MODULE="evofrontend.settings"
CELERYD_CHDIR="/home/lee/Desktop/pyco/evo-scraping-min/evofrontend"
CELERYD_OPTS="--concurrency=1"
# Workers should run as an unprivileged user.
CELERYD_USER="lee"
CELERYD_GROUP="lee"
CELERY_CREATE_DIRS=1
and
# /etc/default/celerybeat
CELERY_BIN="/home/lee/Desktop/pyco/evo-scraping-min/venv/bin/celery"
CELERY_APP="evofrontend"
CELERYBEAT_CHDIR="/home/lee/Desktop/pyco/evo-scraping-min/evofrontend/"
# Django settings module
export DJANGO_SETTINGS_MODULE="evofrontend.settings"
They are largely based on the generic ones, with the Django settings thrown in and using the celery bin in my virtualenv rather than the system one.
I'm also using the generic init.d scripts.
Project structure
As for the project: it lives at /home/lee/Desktop/pyco/evo-scraping-min. All files under it have ownership lee:lee.
The dir contains both a Scrapy project (evo-retail) and a Django project (evofrontend), and the complete tree structure looks like
├── evofrontend
│ ├── db.sqlite3
│ ├── evofrontend
│ │ ├── celery.py
│ │ ├── __init__.py
│ │ ├── settings.py
│ │ ├── urls.py
│ │ └── wsgi.py
│ ├── evosched
│ │ ├── __init__.py
│ │ ├── myutils.py
│ │ └── tasks.py
│ └── manage.py
└── evo-retail
└── retail
├── logs
├── retail
│ ├── __init__.py
│ ├── settings.py
│ └── spiders
│ ├── __init__.py
│ └── Retail_spider.py
└── scrapy.cfg
Django project relevant files
Now the relevant files: the evofrontend/evofrontend/celery.py looks like
# evofrontend/evofrontend/celery.py
from __future__ import absolute_import
import os
from celery import Celery
# set the default Django settings module for the 'celery' program.
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'evofrontend.settings')
from django.conf import settings
app = Celery('evofrontend')
# Using a string here means the worker will not have to
# pickle the object when using Windows.
app.config_from_object('django.conf:settings')
app.autodiscover_tasks(lambda: settings.INSTALLED_APPS)
The potentially relevant settings from the Django settings file, evofrontend/evofrontend/settings.py are
import os
BASE_DIR = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
PROJECT_ROOT = os.path.abspath(os.path.join(os.path.dirname(__file__), os.pardir))
INSTALLED_APPS = (
...
'djcelery',
'evosched',
)
# Celery settings
BROKER_URL = 'amqp://guest:guest@localhost//'
CELERY_ACCEPT_CONTENT = ['json']
CELERY_TASK_SERIALIZER = 'json'
CELERY_RESULT_SERIALIZER = 'json'
CELERY_TIMEZONE = 'Europe/London'
CELERYD_MAX_TASKS_PER_CHILD = 1 # Each worker is killed after one task, this prevents issues with reactor not being restartable
# Use django-celery backend database
CELERY_RESULT_BACKEND = 'djcelery.backends.database:DatabaseBackend'
# Set periodic task
CELERYBEAT_SCHEDULER = "djcelery.schedulers.DatabaseScheduler"
The tasks.py in the scheduling app, evosched, looks like (it just launches the Scrapy spider using the relevant settings after changing dir)
# evofrontend/evosched/tasks.py
from __future__ import absolute_import
from celery import shared_task
from celery.utils.log import get_task_logger
logger = get_task_logger(__name__)
import os
from scrapy.crawler import CrawlerProcess
from scrapy.utils.project import get_project_settings
from django.conf import settings as django_settings
class CrawlerScript(object):
def __init__(self, spider, scrapy_settings):
self.crawler = CrawlerProcess(scrapy_settings)
self.spider = spider # just a string
def run(self, **kwargs):
# Pass the kwargs (usually command line args) to the crawler
self.crawler.crawl(self.spider, **kwargs)
self.crawler.start()
@shared_task
def scrapingTask(**kwargs):
logger.info("Start scrape...")
# scrapy.cfg file here pointing to settings...
base_dir = django_settings.BASE_DIR
os.chdir(os.path.join(base_dir, '..', 'evo-retail/retail'))
scrapy_settings = get_project_settings()
# Run crawler
cs = CrawlerScript('TestSpider', scrapy_settings)
cs.run(**kwargs)
The evofrontend/evosched/myutils.py simply contains (in this min example):
# evofrontend/evosched/myutils.py
SCRAPY_XHR_HEADERS = 'SOMETHING'
Scrapy project relevant files
In the complete Scrapy project the settings file looks like
# evo-retail/retail/retail/settings.py
BOT_NAME = 'retail'
import os
PROJECT_ROOT = os.path.dirname(os.path.abspath(__file__))
SPIDER_MODULES = ['retail.spiders']
NEWSPIDER_MODULE = 'retail.spiders'
and (in this min example) the spider is just
# evo-retail/retail/retail/spiders/Retail_spider.py
from scrapy.conf import settings as scrapy_settings
from scrapy.spiders import Spider
from scrapy.http import Request
import sys
import django
import os
import posixpath
SCRAPY_BASE_DIR = scrapy_settings['PROJECT_ROOT']
DJANGO_DIR = posixpath.normpath(os.path.join(SCRAPY_BASE_DIR, '../../../', 'evofrontend'))
sys.path.insert(0, DJANGO_DIR)
os.environ.setdefault("DJANGO_SETTINGS_MODULE", 'evofrontend.settings')
django.setup()
from evosched.myutils import SCRAPY_XHR_HEADERS
class RetailSpider(Spider):
name = "TestSpider"
def start_requests(self):
print SCRAPY_XHR_HEADERS
yield Request(url='http://www.google.com', callback=self.parse)
def parse(self, response):
print response.url
return []
EDIT:
I discovered through lots of trial and error that if the app I'm trying to import from is in my INSTALLED_APPS Django setting, then it fails with the import error, but if I remove the app from there then I no longer get the import error (e.g. after removing evosched from INSTALLED_APPS, the import in the spider goes through fine...). Obviously not a solution, but it may be a clue.
EDIT 2
I put a print of sys.path immediately before the failing import in the spider, the result was
/home/lee/Desktop/pyco/evo-scraping-min/evofrontend/../evo-retail/retail
/home/lee/Desktop/pyco/evo-scraping-min/venv/lib/python2.7
/home/lee/Desktop/pyco/evo-scraping-min/venv/lib/python2.7/plat-x86_64-linux-gnu
/home/lee/Desktop/pyco/evo-scraping-min/venv/lib/python2.7/lib-tk
/home/lee/Desktop/pyco/evo-scraping-min/venv/lib/python2.7/lib-old
/home/lee/Desktop/pyco/evo-scraping-min/venv/lib/python2.7/lib-dynload
/usr/lib/python2.7
/usr/lib/python2.7/plat-x86_64-linux-gnu
/usr/lib/python2.7/lib-tk
/home/lee/Desktop/pyco/evo-scraping-min/venv/local/lib/python2.7/site-packages
/home/lee/Desktop/pyco/evo-scraping-min/evofrontend
/home/lee/Desktop/pyco/evo-scraping-min/evo-retail/retail
EDIT 3
If I do import evosched and then print dir(evosched), I see "tasks", and if I choose to include such a file I can also see "models", so importing from models would actually be possible. I don't however see "myutils". Even from evosched import myutils fails, and it also fails if the statement is put in a function below rather than at module level (I thought this might root out a circular import issue...). The direct import evosched works... possibly import evosched.myutils will work. Not yet tried...
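To narrow this down, one diagnostic sketch (purely illustrative) would be to print where the imported evosched package actually comes from, immediately before the failing import in Retail_spider.py:
# Diagnostic only: which evosched package is actually being picked up?
import os
import evosched
print(evosched.__file__)                    # path the package was loaded from
print(getattr(evosched, '__path__', None))  # package search path
print(os.listdir(os.path.dirname(evosched.__file__)))  # is myutils.py really in there?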
It seems the celery daemon is running using the system's python and not the python binary inside the virtualenv. You need to use
# Python interpreter from environment.
ENV_PYTHON="$CELERYD_CHDIR/env/bin/python"
as mentioned here, to tell celeryd to run using the python inside the virtualenv.
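In the context of this question that would be something like the following (assuming the generic init.d script honours ENV_PYTHON; note that the venv here lives under the project root, not under $CELERYD_CHDIR):
# /etc/default/celeryd
ENV_PYTHON="/home/lee/Desktop/pyco/evo-scraping-min/venv/bin/python"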
Related
I have built a flask app that I have been starting from an if __name__ == '__main__': block, as I saw in a tutorial. When the time came to get the app to launch from wsgi for production use, I had to remove the port and host options from app.run(), and make changes to the structure of the launching code that I am not too sure about. I am now adding test cases, which adds more ways to launch and access the app (with app.test_client(), with app.test_request_context(), and who knows what else.) What is the right / recommended way to structure the code that creates and launches the application, so that it behaves correctly (and consistently) when launched stand-alone, from wsgi, and for testing?
My current structure is as follows:
def create_app():
"""
Create and initialize our app. Does not call its run() method
"""
app = Flask(__name__)
some_initialization(app, "config_file.json")
return app
app = create_app()
...
# Services decorated with @app.route(...)
...
if __name__ == "__main__":
# The options break wsgi, I had to use `run()`
app.run(host="0.0.0.0", port=5555)
To be clear, I have already gotten wsgi and tests to work, so this question is not about how to do that; it is about the recommended way to organize the state-creating steps so that the result behaves as a module, the app object can be created as many times as necessary, service parameters like port and server can be set, etc. What should my code outline actually look like?
In addition to the launch flag issue, the current code creates an app object (once) as a side effect of importing; I could create more with create_app() but from mycode import app will retrieve the shared object... and I wonder about all those decorators that decorated the original object.
I have looked at the documentation, but the examples are simplified, and the full docs present so many alternative scenarios that I cannot figure out the code structure that the creators of Flask envisioned. I expect this is simple and must have a well-supported code pattern; so what is it?
Disclaimer: While this isn't the only structure for Flask, it is the one that has best suited my needs and it is inspired by the official Flask docs on using a Factory Pattern.
Project Structure
Following the structure from the Documentation
/home/user/Projects/flask-tutorial
├── flaskr/
│ ├── __init__.py
│ ├── db.py
│ ├── schema.sql
│ ├── auth.py
│ ├── blog.py
│ ├── templates/
│ │ ├── base.html
│ │ ├── auth/
│ │ │ ├── login.html
│ │ │ └── register.html
│ │ └── blog/
│ │ ├── create.html
│ │ ├── index.html
│ │ └── update.html
│ └── static/
│ └── style.css
├── tests/
│ ├── conftest.py
│ ├── data.sql
│ ├── test_factory.py
│ ├── test_db.py
│ ├── test_auth.py
│ └── test_blog.py
├── venv/
├── setup.py
└── MANIFEST.in
flaskr/, a Python package containing your application code and files.
flaskr will contain the factory to generate Flask app instances that can be used by WSGI servers and will work with tests, ORMs (for migrations), etc.
flaskr/__init__.py contains the factory method
The Factory
The factory is aimed at configuring and creating a Flask app. This means you need to pass all required configurations in one of the many ways accepted by Flask
The dev Flask server expects the function create_app() to be present in the package __init__.py file. But when using a production server like those listed in docs you can pass the name of the function to call.
A sample from the documentation:
# flaskr/__init__.py
import os
from flask import Flask
def create_app(test_config=None):
# create and configure the app
app = Flask(__name__, instance_relative_config=True)
app.config.from_mapping(
SECRET_KEY='dev',
DATABASE=os.path.join(app.instance_path, 'flaskr.sqlite'),
)
if test_config is None:
# load the instance config, if it exists, when not testing
app.config.from_pyfile('config.py', silent=True)
else:
# load the test config if passed in
app.config.from_mapping(test_config)
# ensure the instance folder exists
try:
os.makedirs(app.instance_path)
except OSError:
pass
# a simple page that says hello
@app.route('/hello')
def hello():
return 'Hello, World!'
return app
When running a dev Flask server, set environment variables as described:
$ export FLASK_APP=flaskr
$ export FLASK_ENV=development
$ flask run
The Routes
A Flask app may have multiple modules that require the app object to function, like @app.route as you mentioned in the comments. To handle this gracefully we can make use of Blueprints. This allows us to keep the routes in a different file and register them in create_app().
from flask import Blueprint, render_template, abort
from jinja2 import TemplateNotFound
simple_page = Blueprint('simple_page', __name__,
template_folder='templates')
@simple_page.route('/', defaults={'page': 'index'})
@simple_page.route('/<page>')
def show(page):
try:
return render_template(f'pages/{page}.html')
except TemplateNotFound:
abort(404)
and we can modify create_app() to register the blueprint as follows:
def create_app(test_config=None):
app = Flask(__name__, instance_relative_config=True)
# configure the app
.
.
.
from yourapplication.simple_page import simple_page
app.register_blueprint(simple_page)
return app
You will need to import the blueprint locally to avoid circular imports. But this is not graceful when you have many blueprints. Hence we can create an init_blueprints(app) function in the blueprints package, like:
# flaskr/blueprints/__init__.py
from flaskr.blueprints.simple_page import simple_page
def init_blueprints(app):
with app.app_context():
app.register_blueprint(simple_page)
and modify create_app() as
from flaskr.blueprints import init_blueprints
def create_app(test_config=None):
app = Flask(__name__, instance_relative_config=True)
# configure the app
.
.
.
init_blueprints(app)
return app
This way your factory does not get cluttered with blueprints, you can handle the registration of blueprints inside the blueprint package as you see fit, and it also avoids circular imports.
Other Extensions
Most common Flask extensions support the factory pattern: you create the extension object once and then call obj.init_app(app) to initialize it with the Flask app. Taking Marshmallow here as an example, but it applies to all of them. Modify create_app() like so:
from flask_marshmallow import Marshmallow

ma = Marshmallow()
def create_app(test_config=None):
app = Flask(__name__, instance_relative_config=True)
# configure the app
.
.
.
init_blueprints(app)
ma.init_app(app)
return app
You can now import ma from flaskr in whichever file requires it.
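For example, a schema module might then look like this (a minimal sketch; the module and field names are hypothetical):
# flaskr/schemas.py (hypothetical module)
from flaskr import ma

class UserSchema(ma.Schema):
    class Meta:
        fields = ("id", "name")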
Production Server
As mentioned initially, the production WSGI servers will call create_app() to create the Flask instance.
Using gunicorn as an example here, but any supported WSGI server can be used.
$ gunicorn "flaskr:create_app()"
You can pass configurations as per gunicorn docs, and the same can be achieved within a script too.
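A minimal sketch of such a script (gunicorn reads its config file as plain Python; the values below are only examples):
# gunicorn.conf.py
bind = "0.0.0.0:8000"   # address and port to listen on
workers = 2             # number of worker processes
and then start it with gunicorn -c gunicorn.conf.py "flaskr:create_app()".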
What I did was:
class App:
def __init__(self):
# Various other initialization (e.g. logging, config, ...)
...
self.webapp = self._start_webapp(self.app_name, self.app_port, self.log)
pass
def _start_webapp(self, app_name: str, app_port: Optional[int], log: logging.Logger):
log.info('Running webapp...')
webapp = Flask(app_name)
# other flask related code
...
webapp.run(debug=False, host='0.0.0.0', port=app_port)
return webapp
pass
if __name__ == '__main__':
app = App()
This way you can add optional parameters to the init to override during tests or override via config change and even create additional types of endpoints in the same application, if you need.
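A hedged sketch of that idea (the extra keyword arguments are hypothetical additions, not part of the code above):
from typing import Optional
from flask import Flask

class App:
    def __init__(self, app_name: str = 'myapp', app_port: Optional[int] = None, run_server: bool = True):
        self.webapp = Flask(app_name)
        # ... other flask related code ...
        if run_server:
            # run() blocks, so tests would pass run_server=False
            self.webapp.run(debug=False, host='0.0.0.0', port=app_port)

if __name__ == '__main__':
    App(app_port=5555)  # stand-alone launch
# in a test: client = App(run_server=False).webapp.test_client()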
I am trying to understand how to split a project which uses Mako and CherryPy into several directories. I have prepared the following directory structure:
[FOLDER] /home/user/myapp
|- main.py
|- app.config
|- server.config
[FOLDER] /home/user/myapp/templates
[FOLDER] /home/user/myapp/templates/base
|- index.html
|- sidebar_menu.html
[FOLDER] /home/user/myapp/config
|- templates.py
In /home/user/myapp/templates there will be the different templates organised in directories.
Under /home/user/myapp/config I have the following file: templates.py with the following code:
# -*- coding: utf-8 -*-
import mako.template
import mako.lookup
# Templates
templates_lookup = mako.lookup.TemplateLookup(
directories=[
'/templates',
'/templates/base',
],
module_directory='/tmp/mako_modules',
input_encoding='utf-8',
output_encoding='utf-8',
encoding_errors='replace'
)
def serve_template(templatename, **kwargs):
mytemplate = templates_lookup.get_template(templatename)
print(mytemplate.render(**kwargs))
Under /home/user/myapp there will be the following main.py file:
# -*- coding: utf-8 -*-
import os
import cherrypy
import mako.template
import mako.lookup
import config.templates
# Main Page
class Index(object):
@cherrypy.expose
def index(self):
t = config.templates.serve_template('index.html')
print(t)
return t
cherrypy.config.update("server.config")
cherrypy.tree.mount(Index(), '/', "app.config")
cherrypy.engine.start()
When I launch the application and access / I get the following message:
500 Internal Server Error
The server encountered an unexpected condition which prevented it from fulfilling the request.
Traceback (most recent call last):
File "C:\Python37\lib\site-packages\mako\lookup.py", line 247, in get_template
return self._check(uri, self._collection[uri])
KeyError: 'index.html'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Python37\lib\site-packages\cherrypy\_cprequest.py", line 628, in respond
self._do_respond(path_info)
File "C:\Python37\lib\site-packages\cherrypy\_cprequest.py", line 687, in _do_respond
response.body = self.handler()
File "C:\Python37\lib\site-packages\cherrypy\lib\encoding.py", line 219, in __call__
self.body = self.oldhandler(*args, **kwargs)
File "C:\Python37\lib\site-packages\cherrypy\_cpdispatch.py", line 54, in __call__
return self.callable(*self.args, **self.kwargs)
File ".....\myapp\main.py", line 18, in index
t = config.templates.serve_template('index.html')
File ".....\myapp\config\templates.py", line 19, in serve_template
mytemplate = templates_lookup.get_template(templatename)
File "C:\Python37\lib\site-packages\mako\lookup.py", line 261, in get_template
"Cant locate template for uri %r" % uri)
mako.exceptions.TopLevelLookupException: Cant locate template for uri 'index.html'
Powered by CherryPy 18.1.0
So basically it seems that Mako cannot locate index.html despite us providing the directories. I guess I am not understanding how Mako uses the directories in the lookup.
Note: the program is actually run on Windows; I used a UNIX file structure above just to make it easier to read.
Python 3.7.2
CherryPy 18.1.0
Mako 1.0.7
You state your directory structure is
/home/user/myapp/templates
but you're telling Mako to look in
/templates
Maybe change code to:
directories=[
'/home/user/myapp/templates',
'/home/user/myapp/templates/base',
],
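Alternatively, a sketch that derives the absolute paths from the location of config/templates.py itself, so the lookup no longer depends on the current working directory (and also works on Windows, which the question notes is the actual platform):
# config/templates.py
import os
import mako.lookup

# one level above the config/ package, i.e. /home/user/myapp
APP_ROOT = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))

templates_lookup = mako.lookup.TemplateLookup(
    directories=[
        os.path.join(APP_ROOT, 'templates'),
        os.path.join(APP_ROOT, 'templates', 'base'),
    ],
    module_directory='/tmp/mako_modules',
    input_encoding='utf-8',
    output_encoding='utf-8',
    encoding_errors='replace'
)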
I usually split the templates into per-page templates and global templates
example:
src/
├── constants.py
├── home
│ └── user
│ └── myapp
│ ├── app.config
│ ├── main.mako
│ ├── main.py
│ └── server.config
└── templates
├── e404.mako
├── e500.mako
├── footer.mako
└── header.mako
In this case, I'll always import a global file with the lookup dirs:
# src/constants.py
from mako.lookup import TemplateLookup
mylookup = TemplateLookup(directories=['.', 'dir/to/src/templates/'])
# home/user/myapp/main.py
from src.constants import mylookup
def main():
if i_have_errer:
template = mylookup.get_template('e500.mako')
else:
template = mylookup.get_template('main.mako')
return template.render_unicode()
The '.' entry will look in the current directory first;
the templates/ entry will look in the global src/templates/ for more generic templates.
I want to create a background task to update a record on a specific date. I'm using Django and Celery with RabbitMQ.
I've managed to get the task called when the model is saved with this dummy task function:
tasks.py
from __future__ import absolute_import
from celery import Celery
from celery.utils.log import get_task_logger
logger = get_task_logger(__name__)
app = Celery('tasks', broker='amqp://localhost//')
@app.task(name='news.tasks.update_news_status')
def update_news_status(news_id):
# (I pass the news id and return it, nothing complicated about it)
return news_id
this task is called from my save() method in my models.py
from django.db import models
from celery import current_app
class News(models.Model):
(...)
def save(self, *args, **kwargs):
current_app.send_task('news.tasks.update_news_status', args=(self.id,))
super(News, self).save(*args, **kwargs)
The thing is, I want to import my News model in tasks.py, but if I try it like this:
from .models import News
I get this error :
django.core.exceptions.ImproperlyConfigured: Requested setting
DEFAULT_INDEX_TABLESPACE, but settings are not configured. You must
either define the environment variable DJANGO_SETTINGS_MODULE or call
settings.configure() before accessing settings.
This is what my celery.py looks like:
from __future__ import absolute_import, unicode_literals
from celery import Celery
import os
# set the default Django settings module for the 'celery' program.
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'myapp.settings')
app = Celery('myapp')
# Using a string here means the worker doesn't have to serialize
# the configuration object to child processes.
# - namespace='CELERY' means all celery-related configuration keys
# should have a `CELERY_` prefix.
app.config_from_object('django.conf:settings', namespace='CELERY')
# Load task modules from all registered Django app configs.
app.autodiscover_tasks()
@app.task(bind=True)
def debug_task(self):
print('Request: {0!r}'.format(self.request))
I have already tried this:
can't import django model into celery task
I have tried to make the import inside the task method Django and Celery, AppRegisteredNotReady exception
I have also tried this Celery - importing models in tasks.py
I also tried to create a utils.py and import it, but that was not possible either.
I ran into different errors, but in the end I'm not able to import any module in tasks.py.
There might be something wrong with my config but I can't see the error, I followed the steps in The Celery Docs: First steps with Django
Also, my project structure looks like this:
├── myapp
│ ├── __init__.py
├── ├── celery.py
│ ├── settings.py
│ ├── urls.py
│ └── wsgi.py
├── news
│ ├── __init__.py
│ ├── admin.py
│ ├── apps.py
│ ├── tasks.py
│ ├── urls.py
│ ├── models.py
│ ├── views.py
├── manage.py
I'm executing the worker from myapp directory like this:
celery -A news.tasks worker --loglevel=info
What am I missing here? Thanks in advance for your help!
EDIT
After making the changes suggested in comments:
Add this to celery.py
app.autodiscover_tasks(lambda: settings.INSTALLED_APPS)
and import inside the method in tasks.py:
from __future__ import absolute_import
from celery import Celery
from celery.utils.log import get_task_logger
logger = get_task_logger(__name__)
app = Celery('tasks', broker='amqp://localhost//')
@app.task(name='news.tasks.update_news_status')
def update_news_status(news_id):
from .models import News
return news_id
I get the following error:
[2018-07-20 12:24:29,337: ERROR/ForkPoolWorker-1] Task news.tasks.update_news_status[87f9ec92-c260-4ee9-a3bc-5f684c819f79] raised unexpected: ValueError('Attempted relative import in non-package',)
Traceback (most recent call last):
File "/Users/carla/Develop/App/backend/myapp-venv/lib/python2.7/site-packages/celery/app/trace.py", line 382, in trace_task
R = retval = fun(*args, **kwargs)
File "/Users/carla/Develop/App/backend/myapp-venv/lib/python2.7/site-packages/celery/app/trace.py", line 641, in __protected_call__
return self.run(*args, **kwargs)
File "/Users/carla/Develop/App/backend/news/tasks.py", line 12, in update_news_status
from .models import News
ValueError: Attempted relative import in non-package
Ok so for anyone struggling with this... turns out my celery.py wasn't reading env variables from the settings.
After a week and lots of research I realised that Celery is not a process of Django but a process running outside of it (duh), so when I tried to load the settings they were loaded, but then I wasn't able to access the env variables I have defined in my .env (I use the dotenv library). Celery was trying to look up the env variables in my .bash_profile (of course).
So in the end my solution was to create a helper module in the same directory where my celery.py is defined, called load_env.py with the following
from os.path import dirname, join
import dotenv
def load_env():
"Get the path to the .env file and load it."
project_dir = dirname(dirname(__file__))
dotenv.read_dotenv(join(project_dir, '.env'))
and then in my celery.py (note the last import and the first instruction):
from __future__ import absolute_import, unicode_literals
from celery import Celery
from django.conf import settings
import os
from .load_env import load_env
load_env()
# set the default Django settings module for the 'celery' program.
os.environ.setdefault("DJANGO_SETTINGS_MODULE", "myapp.settings")
app = Celery('myapp')
# Using a string here means the worker doesn't have to serialize
# the configuration object to child processes.
# - namespace='CELERY' means all celery-related configuration keys
# should have a `CELERY_` prefix.
app.config_from_object('myapp.settings', namespace='CELERY')
# Load task modules from all registered Django app configs.
app.autodiscover_tasks(lambda: settings.INSTALLED_APPS)
After the call to load_env() the env variables are loaded and the celery worker has access to them. By doing this I am now able to import other modules from my tasks.py, which was my main problem.
Credit to these guys (Caktus Consulting Group) and their django-project-template, because if it wasn't for them I wouldn't have found the answer. Thanks.
Try something like this. It's working in Celery 3.1; the import should happen inside the save method and after super():
from django.db import models
class News(models.Model):
(...)
def save(self, *args, **kwargs):
(...)
super(News, self).save(*args, **kwargs)
from news.tasks import update_news_status
update_news_status.apply_async((self.id,)) #apply_async or delay
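Since the original goal is to update a record on a specific date, note that apply_async() also accepts an eta (or countdown) argument; a minimal sketch (the date below is only an example):
from datetime import datetime, timedelta

# schedule the task to run at (roughly) a specific time instead of immediately
update_news_status.apply_async((self.id,), eta=datetime.utcnow() + timedelta(days=7))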
Here is what I would do (Django 1.11 and Celery 4.2). You have a problem in your celery config: you are re-declaring the Celery instance in tasks.py.
tasks.py
from myapp.celery import app # would contain what you need :)
from celery.utils.log import get_task_logger
logger = get_task_logger(__name__)
@app.task(name='news.tasks.update_news_status')
def update_news_status(news_id):
# (I pass the news id and return it, nothing complicated about it)
return news_id
celery.py
from __future__ import absolute_import, unicode_literals
from celery import Celery
from django.conf import settings
import os
os.environ.setdefault("DJANGO_SETTINGS_MODULE", "myapp.settings")
app = Celery('myapp', backend='rpc://', broker=BROKER_URL) # your config here
app.config_from_object('django.myapp:settings', namespace='CELERY') # change here
app.autodiscover_tasks()
models.py
from django.db import models
class News(models.Model):
(...)
def save(self, *args, **kwargs):
super(News, self).save(*args, **kwargs)
from news.tasks import update_news_status
update_news_status.delay(self.id) # change here
And launch it with celery -A myapp worker --loglevel=info, because your app is defined in myapp.celery, so the -A parameter needs to be the package where the conf is declared.
So the problem is that when I try to call the main task from Django's lpr/views.py, the page shows the loading icon and that's it, nothing else happens. There is no output in the Django or Celery console. When I try to run the task from the Python shell it runs without a problem and saves the result in the db. I added the add task for test purposes, and when I run it, it returns an error because of the missing 'y' argument, which is normal. But what is up with that main task?
Here is my code just in case.
Project structure:
Project
├── acpvs
│ ├── celery.py
│ ├── __init__.py
│ ├── settings.py
│ ├── urls.py
│ └── wsgi.py
├── db.sqlite3
├── lpr
│ ├── __init__.py
│ ├── tasks.py
│ ├── urls.py
│ └── views.py
└── manage.py
settings.py
import djcelery
INSTALLED_APPS = [
...
'djcelery',
'django_celery_results',
]
CELERY_BROKER_URL = 'redis://localhost:6379/0'
CELERY_RESULT_BACKEND = 'django-db'
CELERY_TASK_SERIALIZER = 'json'
CELERY_RESULT_SERIALIZER = 'json'
CELERY_ACCEPT_CONTENT = ['json']
CELERY_ALWAYS_EAGER = False
djcelery.setup_loader()
__init__.py
from __future__ import absolute_import, unicode_literals
# This will make sure the app is always imported when
# Django starts so that shared_task will use this app.
from acpvs.celery import app as celery_app
__all__ = ['celery_app']
acpvs.celery.py
from __future__ import absolute_import, unicode_literals
import os
from celery import Celery
# set the default Django settings module for the 'celery' program.
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'acpvs.settings')
app = Celery('acpvs')
# Using a string here means the worker doesn't have to serialize
# the configuration object to child processes.
# - namespace='CELERY' means all celery-related configuration keys
# should have a `CELERY_` prefix.
app.config_from_object('django.conf:settings', namespace='CELERY')
# Load task modules from all registered Django app configs.
app.autodiscover_tasks()
@app.task(bind=True)
def debug_task(self):
print('Request: {0!r}'.format(self.request))
lpr.tasks.py
from __future__ import absolute_import, unicode_literals
from celery import shared_task
from djcelery import celery
@shared_task
def add(x, y):
return x + y
@shared_task
def main():
...
args = {
'imageName': imageName,
'flag': True
}
return args
lpr.urls.py
from django.conf.urls import url
from . import views
urlpatterns = [
url(r'^t/$', views.test_add),
url(r'^t1/$', views.test_main),
]
lpr.views.py
from . import tasks
from django.http import HttpResponse
def test_add(request):
result = tasks.add.delay()
return HttpResponse(result.task_id)
def test_main(request):
result = tasks.main.delay()
return HttpResponse(result.task_id)
Update
It seems to me that there is still something wrong with how I have integrated Celery. When I remove .delay() from views.py it works, but of course not asynchronously and not using Celery.
delay() actually executes the task asynchronously. Please confirm whether it is updating the values in the db.
I think it will update the value in the db (if the main method does that), but it will not return the value, since the client just adds a message to the queue; the broker then delivers that message to a worker, which then performs the work of main.
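If you do need the returned value later, the AsyncResult that delay() hands back can be polled; a minimal sketch (blocking with get() inside a view is normally a bad idea, it is shown only for illustration):
result = tasks.main.delay()
print(result.id)                 # the task id the view currently returns
print(result.ready())            # has the worker finished yet?
value = result.get(timeout=10)   # blocks until the result is stored in the backend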
So I got it working by removing all djcelery instances and upgrading Django from 1.11 to 2.0.3
By the way I'm using Celery 4.1.0
Jsonify doesn't seem to work outside of an application context, is there a workaround?
I am replacing some ajax requests with websockets because it is needed for performance and network reasons. I installed Flask-Sockets with pip in my env. Now I get an error:
RuntimeError: working outside of application context
The skeleton of my application is as follows:
app/
├── forms
├── static
│ ├── css
│ ├── img
│ │ └── DefaultIcon
│ │ ├── eps
│ │ └── png
│ └── js
├── templates
├── ups
└── views
The websocket Python code is located in views/ajax.py:
# -*- coding: utf-8 -*-
# OS Imports
import time
# Flask Imports
from flask import jsonify
from .. import sockets
from app.functions import get_cpu_load, get_disk_usage, get_vmem
# Local Imports
from app import app
from app.views.constants import info, globalsettings
@sockets.route('/_system')
def _system(ws):
"""
Returns the system informations, JSON Format
CPU, RAM, and Disk Usage
"""
while True:
message = ws.receive()
if message == "update":
cpu = round(get_cpu_load())
ram = round(get_vmem())
disk = round(get_disk_usage())
ws.send(jsonify(cpu=cpu, ram=ram, disk=disk))
I launch my application using this command:
gunicorn -k flask_sockets.worker app:app
Here is my __init__.py in the app/ folder:
# -*- coding: utf-8 -*-
from flask import Flask
from flask_sockets import Sockets
app = Flask(__name__)
sockets = Sockets(app)
app.config.from_object('config')
from app import views as application
Why doesn't jsonify work, what can I use instead?
In Flask, jsonify is a helper that builds a Response with the data serialized as JSON, and it needs an application context to do so.
You can do it like this instead:
import json
then change ws.send to:
ws.send(json.dumps(dict(cpu=cpu, ram=ram, disk=disk)))
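If you specifically want jsonify's Response handling, a hedged alternative is to push an application context around the call (json.dumps above is the simpler fix; this is only a sketch):
from flask import jsonify
from app import app  # the Flask instance created in app/__init__.py

with app.app_context():
    payload = jsonify(cpu=cpu, ram=ram, disk=disk).get_data(as_text=True)
ws.send(payload)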