I'm trying to get a function to work in my Django project with celery beat. The function uses a class imported from a wrapper library, and I've read that Celery doesn't work easily with classes. My function login_mb doesn't take an argument, but when I register and call this task I get an error: Couldn't apply scheduled task login_mb: login_mb() takes 0 positional arguments but 1 was given
Is this because of self in the wrapper function imported?
What could I do to get this to work with celerybeat?
settings.py
from datetime import timedelta

CELERY_BEAT_SCHEDULE = {
    'login_mb': {
        'task': 'backend.tasks.login_mb',
        'schedule': timedelta(minutes=30),
    },
}
tasks.py
from matchbook.apiclient import APIClient
import logging
from celery import shared_task

log = logging.getLogger(__name__)

@shared_task(bind=True)
def login_mb():
    mb = APIClient('abc', '123')
    mb.login()
    mb.keep_alive()
apiclient.py (wrapper library)
from matchbook.baseclient import BaseClient
from matchbook import endpoints

class APIClient(BaseClient):
    def __init__(self, username, password=None):
        super(APIClient, self).__init__(username, password)
        self.login = endpoints.Login(self)
        self.keep_alive = endpoints.KeepAlive(self)
        self.logout = endpoints.Logout(self)
        self.betting = endpoints.Betting(self)
        self.account = endpoints.Account(self)
        self.market_data = endpoints.MarketData(self)
        self.reference_data = endpoints.ReferenceData(self)
        self.reporting = endpoints.Reporting(self)

    def __repr__(self):
        return '<APIClient [%s]>' % self.username

    def __str__(self):
        return 'APIClient'
The error is not related to your wrapper library, and there seems to be nothing wrong with the body of your task.
The problem arises because you've defined your task with bind=True. When you do that, Celery automatically passes the task instance as the first argument to the function, which gives you access to information about the current task. So you can either remove bind=True, or add a parameter to your task function like so:
@shared_task(bind=True)
def login_mb(self):
    mb = APIClient('abc', '123')
    mb.login()
    mb.keep_alive()
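To see where the extra argument comes from, here is a plain-Python sketch of what bind=True does. FakeTask and fake_shared_task are hypothetical stand-ins written for this illustration; they are not Celery's classes and only mimic the one behavior discussed here.

```python
class FakeTask:
    """Hypothetical stand-in for a Celery task object (not Celery's real class)."""

    def __init__(self, func, bind=False):
        self.func = func
        self.bind = bind
        self.name = func.__name__

    def __call__(self, *args, **kwargs):
        if self.bind:
            # bind=True: the task instance is passed as the first positional argument
            return self.func(self, *args, **kwargs)
        return self.func(*args, **kwargs)


def fake_shared_task(bind=False):
    """Hypothetical decorator that imitates shared_task(bind=...)."""
    def decorator(func):
        return FakeTask(func, bind=bind)
    return decorator


@fake_shared_task(bind=True)
def login_broken():
    # No parameter to receive the task instance, so calling this fails
    return "logged in"


@fake_shared_task(bind=True)
def login_fixed(self):
    # 'self' is the task instance, not an APIClient
    return "logged in from task " + self.name


try:
    login_broken()
    error_message = None
except TypeError as exc:
    error_message = str(exc)

fixed_result = login_fixed()
```

The broken variant fails with the same kind of TypeError as in the question, while the fixed variant receives the task object and can read attributes such as its name.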
I am using Celery to execute my asynchronous tasks, and what I'm trying to achieve is to get the name and the id of each task in the workflow after I execute it.
exec_workflow = chain(
    task1.si(),
    task2.si(),
    task3.si()
)
result = exec_workflow.apply_async()
tasks = []
for t in result._parents():
    tasks.append({"id": t.id, "name": t.name})
But it seems that AsyncResult does not have a name property for some reason. Any idea what the appropriate way to do this would be?
A different approach might be to force an id on each task before I call apply_async. That would solve my problem, because I would be able to match each id to a task name, but I'm not sure whether it's possible.
Thanks.
Not the best solution but it works.
result = signature.apply_async()
result._cache['task_name']
#'procedures.tasks.stop'
There is a configuration option result_extended in Celery for this purpose (it is set to False by default).
Enables extended task result attributes (name, args, kwargs, worker, retries, queue, delivery_info) to be written to backend.
Ref.:
https://docs.celeryproject.org/en/master/userguide/configuration.html#result-extended
Consumer example (Worker)
from typing import Final
from celery import Celery
app: Final = Celery(
broker="amqp://...",
result_backend="redis://...",
result_extended=True,
)
@app.task(
    name="foo-service:bar"
)
def _() -> int:
    return 42
Producer example (Client)
from pprint import pprint
from typing import Final
from celery import Celery
from celery.result import AsyncResult
app: Final = Celery(broker="amqp://...", result_backend="redis://...")
result: AsyncResult = app.send_task("foo-service:bar")
assert result.get() == 42
assert result.name == "foo-service:bar"
assert result.queue == ...
assert result.args == ...
assert result.kwargs == ...
assert result.worker == ...
pprint(result.__dict__)
Alright, so I've solved my problem. What I did eventually was to just set the id of each task myself.
I am migrating a project to Django and like to use the django-rq module.
However, I am stuck at what to put here:
import django_rq
queue = django_rq.get_queue('high')
queue.enqueue(func, foo, bar=baz)
How do I refer to func? Can it be a string like path.file.function?
Does the function need to reside in the same file?
Create a tasks.py file containing:
from django_rq import job

@job("high", timeout=600)  # timeout is optional
def your_func(foo, bar=None):
    pass  # do some logic
and then in your code
import django_rq
from tasks import your_func
queue = django_rq.get_queue('high')
queue.enqueue(your_func, foo, bar=baz)
I have the following code that starts a Celery chain when a URL is visited. The chain steps are passed through a query parameter like: /process_pipeline/?pipeline=task_a|task_c|task_b. In order to avoid launching several identical chains (if, for instance, someone refreshes the page), I use a simple cache locking system.
I have a timeout on the cache entry, but what I'm missing is a way to release the lock when the chain has completed.
Any idea?
tasks.py
from __future__ import absolute_import
from celery import shared_task

registry = {}

def register(fn):
    registry[fn.__name__] = fn

@shared_task
def task_a(*args, **kwargs):
    print('task a')

@shared_task
def task_b(*args, **kwargs):
    print('task b')

@shared_task
def task_c(*args, **kwargs):
    print('task c')

register(task_a)
register(task_b)
register(task_c)
views.py
from __future__ import absolute_import
from django.core.cache import cache as memcache
from django.shortcuts import redirect
from django.utils.hashcompat import md5_constructor as md5
from celery import chain
from .tasks import registry

LOCK_EXPIRE = 60 * 5  # Lock expires in 5 minutes

def process_pipeline(request):
    pipeline = request.GET.get('pipeline')
    hexdigest = md5(pipeline).hexdigest()
    lock_id = 'lock-{0}'.format(hexdigest)
    # cache.add fails if the key already exists
    acquire_lock = lambda: memcache.add(lock_id, None, LOCK_EXPIRE)
    # memcache delete is very slow, but we have to use it to take
    # advantage of using add() for atomic locking
    release_lock = lambda: memcache.delete(lock_id)
    if acquire_lock():
        args = [registry[p].s() for p in pipeline.split('|')]
        task = chain(*args).apply_async()
        memcache.set(lock_id, task.id)
        return redirect('celery-task_status', task_id=task.id)
    else:
        task_id = memcache.get(lock_id)
        return redirect('celery-task_status', task_id=task_id)
urls.py
from django.conf.urls import patterns, url

urlpatterns = patterns('aafilters.views',
    url(r'^process_pipeline/$', 'process_pipeline', name="process_pipeline"),
)
I have never used it, but I think you should take a look at Celery Canvas. It seems to be what you want.
The Celery documentation mentions testing Celery within Django but doesn't explain how to test a Celery task if you are not using Django. How do you do this?
It is possible to test tasks synchronously using any unittest library out there. I normally do two different test sessions when working with Celery tasks. The first one (as I'm suggesting below) is completely synchronous and should be the one that makes sure the algorithm does what it should do. The second session uses the whole system (including the broker) and makes sure I'm not having serialization issues or any other distribution or communication problems.
So:
from celery import Celery

celery = Celery()

@celery.task
def add(x, y):
    return x + y
And your test:
from nose.tools import eq_

def test_add_task():
    rst = add.apply(args=(4, 4)).get()
    eq_(rst, 8)
Here is an update to my seven-year-old answer:
You can run a worker in a separate thread via a pytest fixture:
https://docs.celeryq.dev/en/v5.2.6/userguide/testing.html#celery-worker-embed-live-worker
According to the docs, you should not use "always_eager" (see the top of the page of the above link).
Old answer:
I use this:
with mock.patch('celeryconfig.CELERY_ALWAYS_EAGER', True, create=True):
    ...
Docs: https://docs.celeryq.dev/en/3.1/configuration.html#celery-always-eager
CELERY_ALWAYS_EAGER lets you run your task synchronously, and you don't need a celery server.
Depends on what exactly you want to be testing.
Test the task code directly. Don't call "task.delay(...)" just call "task(...)" from your unit tests.
Use CELERY_ALWAYS_EAGER. This will cause your tasks to be called immediately at the point you say "task.delay(...)", so you can test the whole path (but not any asynchronous behavior).
For those on Celery 4 it's:
@override_settings(CELERY_TASK_ALWAYS_EAGER=True)
Because the settings names have been changed and need updating if you choose to upgrade, see
https://docs.celeryproject.org/en/latest/history/whatsnew-4.0.html?highlight=what%20is%20new#lowercase-setting-names
unittest
import unittest
from myproject.myapp import celeryapp

class TestMyCeleryWorker(unittest.TestCase):
    def setUp(self):
        celeryapp.conf.update(CELERY_ALWAYS_EAGER=True)
py.test fixtures
# conftest.py
import pytest
from myproject.myapp import celeryapp

@pytest.fixture(scope='module')
def celery_app(request):
    celeryapp.conf.update(CELERY_ALWAYS_EAGER=True)
    return celeryapp

# test_tasks.py
def test_some_task(celery_app):
    ...
Addendum: make send_task respect eager
from celery import current_app

def send_task(name, args=(), kwargs=None, **opts):
    # https://github.com/celery/celery/issues/581
    task = current_app.tasks[name]
    return task.apply(args, kwargs or {}, **opts)

current_app.send_task = send_task
As of Celery 3.0, one way to set CELERY_ALWAYS_EAGER in Django is:
from django.test import TestCase, override_settings
from .foo import foo_celery_task

class MyTest(TestCase):
    @override_settings(CELERY_ALWAYS_EAGER=True)
    def test_foo(self):
        self.assertTrue(foo_celery_task.delay())
Since Celery v4.0, py.test fixtures are provided that start a Celery worker just for the test and shut it down when done:
def test_myfunc_is_executed(celery_session_worker):
    # celery_session_worker: <Worker: gen93553@mymachine.local (running)>
    assert myfunc.delay().wait(3)
Among other fixtures described on http://docs.celeryproject.org/en/latest/userguide/testing.html#py-test, you can change the celery default options by redefining the celery_config fixture this way:
@pytest.fixture(scope='session')
def celery_config():
    return {
        'accept_content': ['json', 'pickle'],
        'result_serializer': 'pickle',
    }
By default, the test worker uses an in-memory broker and result backend. No need to use a local Redis or RabbitMQ if not testing specific features.
reference
using pytest.
def test_add(celery_worker):
    mytask.delay()
if you use flask, set the app config
CELERY_BROKER_URL = 'memory://'
CELERY_RESULT_BACKEND = 'cache+memory://'
and in conftest.py
@pytest.fixture
def app():
    yield app  # Your actual Flask application

@pytest.fixture
def celery_app(app):
    from celery.contrib.testing import tasks  # need it
    yield celery_app  # Your actual Flask-Celery application
In my case (and I assume many others), all I wanted was to test the inner logic of a task using pytest.
TL;DR; ended up mocking everything away (OPTION 2)
Example Use Case:
proj/tasks.py
from celery import shared_task

@shared_task(bind=True)
def add_task(self, a, b):
    return a + b
tests/test_tasks.py
from proj import add_task

def test_add():
    assert add_task(1, 2) == 3, '1 + 2 should equal 3'
But since the shared_task decorator runs a lot of Celery-internal logic, this isn't really a unit test.
So, for me, there were 2 options:
OPTION 1: Separate internal logic
proj/tasks_logic.py
def internal_add(a, b):
    return a + b
proj/tasks.py
from celery import shared_task
from .tasks_logic import internal_add

@shared_task(bind=True)
def add_task(self, a, b):
    return internal_add(a, b)
This looks very odd. Besides making the code less readable, it requires manually extracting and passing any attributes that are part of the request, for instance the task_id if you need it, which makes the logic less pure.
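To make that drawback concrete, here is a sketch of Option 1 where the logic needs the task id. The task_id parameter and the returned dictionary are assumptions added for illustration; the Celery wrapper is shown only as a comment so the sketch runs without Celery installed.

```python
def internal_add(a, b, task_id=None):
    """Pure logic: anything it needs from the Celery request arrives as a plain argument."""
    return {"task_id": task_id, "result": a + b}


# In proj/tasks.py the bound task would shrink to a thin wrapper, roughly:
#
#   @shared_task(bind=True)
#   def add_task(self, a, b):
#       return internal_add(a, b, task_id=self.request.id)


def test_internal_add():
    # No Celery objects are involved; the id is just a string
    assert internal_add(1, 2, task_id="abc") == {"task_id": "abc", "result": 3}


test_internal_add()
```

The test is trivial to write, but every request attribute the logic needs has to be threaded through as an extra parameter, which is the cost described above.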
OPTION 2: mocks
mocking away celery internals
tests/__init__.py
# noinspection PyUnresolvedReferences
from celery import shared_task
from mock import patch

def mock_signature(**kwargs):
    return {}

def mocked_shared_task(*decorator_args, **decorator_kwargs):
    def mocked_shared_decorator(func):
        func.signature = func.si = func.s = mock_signature
        return func
    return mocked_shared_decorator

patch('celery.shared_task', mocked_shared_task).start()
This then allows me to mock the request object (again, in case you need things from the request, like the id or the retries counter).
tests/test_tasks.py
from proj import add_task

class MockedRequest:
    def __init__(self, id=None):
        self.id = id or 1

class MockedTask:
    def __init__(self, id=None):
        self.request = MockedRequest(id=id)

def test_add():
    mocked_task = MockedTask(id=3)
    assert add_task(mocked_task, 1, 2) == 3, '1 + 2 should equal 3'
This solution is much more manual, but, it gives me the control I need to actually unit test, without repeating myself, and without losing the celery scope.
I see a lot of answers suggesting CELERY_ALWAYS_EAGER = True as the solution for unit tests, but as of version 5.0.5 there have been many changes that make most of the old answers outdated and, for me, a waste of time. So for everyone here searching for a solution: go to the docs and read the well-documented unit test examples for the new version:
https://docs.celeryproject.org/en/stable/userguide/testing.html
As for eager mode in unit tests, here is a quote from the current docs:
Eager mode
The eager mode enabled by the task_always_eager setting is by
definition not suitable for unit tests.
When testing with eager mode you are only testing an emulation of what
happens in a worker, and there are many discrepancies between the
emulation and what happens in reality.
Another option is to mock the task if you do not need the side effects of running it.
from unittest import mock

@mock.patch('module.module.task')
def test_name(self, mock_task): ...
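A slightly fuller sketch of the same idea, using only the standard library. FakeTask, send_welcome_email and register_user are hypothetical names invented for this example; in a real project you would patch the actual task where the code under test looks it up.

```python
from unittest import mock


class FakeTask:
    """Hypothetical stand-in for a Celery task; a real .delay() would need a broker."""

    def delay(self, *args, **kwargs):
        raise RuntimeError("would enqueue a real task")


send_welcome_email = FakeTask()


def register_user(user_id):
    """Code under test: its only side effect is queuing the task."""
    send_welcome_email.delay(user_id)
    return True


def test_register_user_queues_email():
    # Replace .delay with a mock for the duration of the block
    with mock.patch.object(send_welcome_email, "delay") as mock_delay:
        assert register_user(42) is True
    # The test checks that the task was queued with the right argument,
    # not what the task itself would do.
    mock_delay.assert_called_once_with(42)


test_register_user_queues_email()
```

This keeps the test fast and independent of any broker, at the price of not running the task body at all, so the task's own logic needs separate tests.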