How can I test on_failure in celery - python

My celery task has a base class were an on_failure method is implemented.
In my test, I patched one of the methods that the task is calling to, to raise an exception but on_faliure is never called.
Base class
class BaseTask(celery.Task):
abstract = True
def on_failure(self, exc, task_id, args, kwargs, einfo):
print("error")
Task
#celery.task(bind=True, base=BaseTask)
def multiple(self, a, b):
logic.method(a, b)
Test
#patch('tasks.logic.method')
def test_something(self, mock):
# arrange
mock.side_effect = NotImplementedError
# act
with self.assertRaises(NotImplementedError):
multiple(1, 2)
When running celery and an exception is raised everything works fine.
CELERY_ALWAYS_EAGER is activated.
how can I make on_faliure run?

From a discussion on a issue in celery GitHub: on_failure test is "already done on the Celery level (verifying if on_failure is called)" and "write a test to test whatever your on_failure does instead". You could define a function inside the on_failure method and test it, or call on_failure like a classmethod:
import TestCase
from billiard.einfo import ExceptionInfo
class TestTask(TestCase):
def test_on_failure(self):
"Testing on failure method"
exc = Exception("Test")
task_id = "test_task_id"
args = ["argument 1"]
kwargs = {"key": "value"}
einfo = ExceptionInfo
# call on_failure method
multiple.on_failure(exc, task_id, args, kwargs, einfo)
# assert something appened
ExceptionInfo is the same type of object celery uses; multiple is your task as you defined it in your question.
Hope this helps

Related

celery: get function name by task id?

I am using celery on_failure handler to logging all failed tasks for debugging and analysis. And I want to know the task name(function name) of the failed task, how can I get that?
from celery import Task
class DebugTask(Task):
abstract = True
def after_return(self, *args, **kwargs):
print('Task returned: {0!r}'.format(self.request))
def on_failure(self, exc, task_id, args, kwargs, einfo):
func_name = get_func_name_by_task_id(task_id) # how do I do this?
print "{} failed".format(func_name) # expected out: add failed.
#app.task(base=DebugTask)
def add(x, y):
return x + y
PS: I know there is task_id, but query function name by task_id every time is not fun,
Quick look at the documentation shows Task.name.

Why does Celery NOT throw an Exception when the underlying task throws one

Celery doesn't seem to be handling exceptions properly.
If I have task:
def errorTest():
raise Exception()
and then I call
r = errorTest.delay()
In [8]: r.result
In [9]: r.state
Out[9]: 'PENDING'
And it will hang like this indefinitely.
Going and checking the logs shows that the error IS getting thrown in the task (and if you want the message, ask), and I know that the backend and everything is set up properly because other tasks just work and return results correctly.
Is there something funky that I need to do to catch exceptions in Celery?
/Celery version is 3.0.13, broker is RabbitMQ running on my local machine
If you are running Celery with the CELERY_ALWAYS_EAGER set to True, then make sure you include this line in your settings too:
CELERY_EAGER_PROPAGATES_EXCEPTIONS = True
http://docs.celeryproject.org/en/latest/configuration.html#celery-eager-propagates-exceptions
You can define an on_failure function in your Task subclass to handle them correctly. If you're just looking to find out what happened you can setup error email notifications that will send you the stack trace in your celery config.
Note: As of v4 Celery no longer supports sending emails.
Going to make #primalpython's answer more explicit.
This will fail:
#task
def error():
raise Exception
Input/Output:
In [7]: r = error.delay()
In [8]: print r.state
Out[8]: 'PENDING'
In [9]: print r.result
Out[9]: None
This will succeed:
#task
def error():
raise Exception
def on_failure(self, *args, **kwargs):
pass
Input/Output:
In [7]: r = error.delay()
In [8]: print r.state
Out[8]: 'FAILURE'
In [9]: print r.result
Out[9]: Exception()
IMO the easiest way to do this is to pass in a reference to a task class to use when you create your new Celery application.
In one module define the task class to use by default:
from celery.app.task import Task
import logging
logger=logging.getLogger(__name__)
class LoggingTask(Task):
def on_failure(self, exc, task_id, args, kwargs, einfo):
kwargs={}
if logger.isEnabledFor(logging.DEBUG):
kwargs['exc_info']=exc
logger.error('Task % failed to execute', task_id, **kwargs)
super().on_failure(exc, task_id, args, kwargs, einfo)
When you define your app, reference the module (note, it's a string reference you provide..):
from celery import Celery
app=Celery('my_project_name', task_cls='task_package.module_name:LoggingTask')
From that point forward, if no task class is specifically provided, the LoggingTask will be used - thereby allowing you to effect all existing tasks (that use the default) rather than having to modify each one. This also means you can use the #shared_task decorator as normal..

Celery task with multiple decorators not auto registering task name

I'm having a task that looks like this
from mybasetask_module import MyBaseTask
#task(base=MyBaseTask)
#my_custom_decorator
def my_task(*args, **kwargs):
pass
and my base task looks like this
from celery import task, Task
class MyBaseTask(Task):
abstract = True
default_retry_delay = 10
max_retries = 3
acks_late = True
The problem I'm running into is that the celery worker is registering the task with the name
'mybasetask_module.__inner'
The task is registerd fine (which is the package+module+function) when I remove #my_custom_decorator from the task or if I provide an explicit name to the task like this
from mybasetask_module import MyBaseTask
#task(base=MyBaseTask, name='an_explicit_task_name')
#my_custom_decorator
def my_task(*args, **kwargs):
pass
Is this behavior expected? Do I need to do something so that my tasks are registered with the default auto registered name in the first case when I have multiple decorators but no explicit task name?
Thanks,
Use the functools.wraps() decorator to ensure that the wrapper returned by my_custom_decorator has the correct name:
from functools import wraps
def my_custom_decorator(func):
#wraps(func)
def __inner():
return func()
return __inner
The task name is taken from the function call that the task decorator wraps, but by inserting a decorator in between, you gave task your __inner wrapping function instead. The functools.wraps() decorator copies all the necessary metadata over from func to the wrapper so that task() can pick up the proper name.

GAE: Unit testing DeadlineExceededError

I've been using testbed, webtest, and nose to test my Python GAE app, and it is a great setup. I'm now implementing something similar to Nick's great example of using the deferred library, but I can't figure out a good way to test the parts of the code triggered by DeadlineExceededError.
Since this is in the context of a taskqueue, it would be painful to construct a test that took more than 10 minutes to run. Is there a way to temporarily set the taskqueue time limit to a few seconds for the purpose of testing? Or perhaps some other way to elegantly test the execution of code in the except DeadlineExceededError block?
Abstract the "GAE context" for your code. in production provide real "GAE implementation" for testing provide a mock own that will raise the DeadlineExceededError. The test should not depend on any timeout, should be fast.
Sample abstraction (just glue):
class AbstractGAETaskContext(object):
def task_spired(): pass # this will throw exception in mock impl
# here you define any method that you call into GAE, to be mocked
def defered(...): pass
If you don't like abstraction, you can do monkey patching for testing only, also you need to define the task_expired function to be your hook for testing.
task_expired should be called during your task implementation function.
*UPDATED*This the 3rd solution:
First I want to mention that the Nick's sample implementation is not so great, the Mapper class has to many responsabilities(deferring, query data, update in batch); and this make the test hard to made, a lot of mocks need to be defined. So I extract the deferring responsabilities in a separate class. You only want to test that deferring mechanism, what actually is happen(the update, query, etc) should be handled in other test.
Here is deffering class, also this no more depends on GAE:
class DeferredCall(object):
def __init__(self, deferred):
self.deferred = deferred
def run(self, long_execution_call, context, *args, **kwargs):
''' long_execution_call should return a tuple that tell us how was terminate operation, with timeout and the context where was abandoned '''
next_context, timeouted = long_execution_call(context, *args, **kwargs)
if timeouted:
self.deferred(self.run, next_context, *args, **kwargs)
Here is the test module:
class Test(unittest.TestCase):
def test_defer(self):
calls = []
def mock_deferrer(callback, *args, **kwargs):
calls.append((callback, args, kwargs))
def interrupted(self, context):
return "new_context", True
d = DeferredCall()
d.run(interrupted, "init_context")
self.assertEquals(1, len(calls), 'a deferred call should be')
def test_no_defer(self):
calls = []
def mock_deferrer(callback, *args, **kwargs):
calls.append((callback, args, kwargs))
def completed(self, context):
return None, False
d = DeferredCall()
d.run(completed, "init_context")
self.assertEquals(0, len(calls), 'no deferred call should be')
How will look the Nick's Mapper implementation:
class Mapper:
...
def _continue(self, start_key, batch_size):
... # here is same code, nothing was changed
except DeadlineExceededError:
# Write any unfinished updates to the datastore.
self._batch_write()
# Queue a new task to pick up where we left off.
##deferred.defer(self._continue, start_key, batch_size)
return start_key, True ## make compatible with DeferredCall
self.finish()
return None, False ## make it comaptible with DeferredCall
runner = _continue
Code where you register the long running task; this only depend on the GAE deferred lib.
import DeferredCall
import PersonMapper # this inherits the Mapper
from google.appengine.ext import deferred
mapper = PersonMapper()
DeferredCall(deferred).run(mapper.run)

celery task and customize decorator

I'm working on a project using django and celery(django-celery). Our team decided to wrap all data access code within (app-name)/manager.py(NOT wrap into Managers like the django way), and let code in (app-name)/task.py only dealing with assemble and perform tasks with celery(so we don't have django ORM dependency in this layer).
In my manager.py, I have something like this:
def get_tag(tag_name):
ctype = ContentType.objects.get_for_model(Photo)
try:
tag = Tag.objects.get(name=tag_name)
except ObjectDoesNotExist:
return Tag.objects.none()
return tag
def get_tagged_photos(tag):
ctype = ContentType.objects.get_for_model(Photo)
return TaggedItem.objects.filter(content_type__pk=ctype.pk, tag__pk=tag.pk)
def get_tagged_photos_count(tag):
return get_tagged_photos(tag).count()
In my task.py, I like to wrap them into tasks (then maybe use these tasks to do more complicated tasks), so I write this decorator:
import manager #the module within same app containing data access functions
class mfunc_to_task(object):
def __init__(mfunc_type='get'):
self.mfunc_type = mfunc_type
def __call__(self, f):
def wrapper_f(*args, **kwargs):
callback = kwargs.pop('callback', None)
mfunc = getattr(manager, f.__name__)
result = mfunc(*args, **kwargs)
if callback:
if self.mfunc_type == 'get':
subtask(callback).delay(result)
elif self.mfunc_type == 'get_or_create':
subtask(callback).delay(result[0])
else:
subtask(callback).delay()
return result
return wrapper_f
then (still in task.py):
##task
#mfunc_to_task()
def get_tag():
pass
##task
#mfunc_to_task()
def get_tagged_photos():
pass
##task
#mfunc_to_task()
def get_tagged_photos_count():
pass
Things work fine without #task.
But, after applying that #task decorator(to the top as celery documentation instructed), things just start to fall apart. Apparently, every time the mfunc_to_task.__call__ gets called, the same task.get_tag function gets passed as f. So I ended up with the same wrapper_f every time, and now the only thing I cat do is to get a single tag.
I'm new to decorators. Any one can help me understand what went wrong here, or point out other ways to achieve the task? I really hate to write the same task wrap code for every of my data access functions.
Not quite sure why passing arguments won't work?
if you use this example:
#task()
def add(x, y):
return x + y
lets add some logging to the MyCoolTask:
from celery import task
from celery.registry import tasks
import logging
import celery
logger = logging.getLogger(__name__)
class MyCoolTask(celery.Task):
def __call__(self, *args, **kwargs):
"""In celery task this function call the run method, here you can
set some environment variable before the run of the task"""
logger.info("Starting to run")
return self.run(*args, **kwargs)
def after_return(self, status, retval, task_id, args, kwargs, einfo):
#exit point of the task whatever is the state
logger.info("Ending run")
pass
and create an extended class (extending MyCoolTask, but now with arguments):
class AddTask(MyCoolTask):
def run(self,x,y):
if x and y:
result=add(x,y)
logger.info('result = %d' % result)
return result
else:
logger.error('No x or y in arguments')
tasks.register(AddTask)
and make sure you pass the kwargs as json data:
{"x":8,"y":9}
I get the result:
[2013-03-05 17:30:25,853: INFO/MainProcess] Starting to run
[2013-03-05 17:30:25,855: INFO/MainProcess] result = 17
[2013-03-05 17:30:26,739: INFO/MainProcess] Ending run
[2013-03-05 17:30:26,741: INFO/MainProcess] Task iamscheduler.tasks.AddTask[6a62641d-16a6-44b6-a1cf-7d4bdc8ea9e0] succeeded in 0.888684988022s: 17
Instead of use decorator why you don't create a base class that extend celery.Task ?
In this way all your tasks can extend your customized task class, where you can implement your personal behavior by using methods __call__ and after_return
.
You can also define common methods and object for all your task.
class MyCoolTask(celery.Task):
def __call__(self, *args, **kwargs):
"""In celery task this function call the run method, here you can
set some environment variable before the run of the task"""
return self.run(*args, **kwargs)
def after_return(self, status, retval, task_id, args, kwargs, einfo):
#exit point of the task whatever is the state
pass

Categories