I'm trying to get the state of a task as follows:
__init__.py
celery = Celery(app.name, backend='amqp', broker=app.config['CELERY_BROKER_URL'])
celery.conf.update(app.config)
foo.py
class Foo(object):
    def bar(self):
        task = self._bar_async.apply_async()
        return task.id

    @celery.task(filter=task_method, bind=True)
    def _bar_async(task, self):
        for i in range(0, 100):
            task.update_state(state='PROGRESS', meta={'progress': i})
            time.sleep(2)
taskstatus.py
def taskstatus(task_id):
    task = celery.AsyncResult(id=task_id)
Is this the recommended way to use update_state with bind?
Also, when I try to get the state of the task using taskstatus, I always get NoneType for task. What is the problem?
There are two issues in your code.
First, pass self as an argument to apply_async:
def bar(self):
    task = self._bar_async.apply_async([self])
This change fixes the NoneType issue: without that argument the task fails in the worker, so you cannot get a result for it.
Second, use app.backend.get_result() in taskstatus() to see the progress instead of AsyncResult.get(), since get() blocks until the task state becomes ready.
from apps import celery
app = celery.app
r = app.backend.get_result(task_id)
print(r)
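As a rough sketch of what a non-blocking status view could look like with that approach (the import path and the route are assumptions; app and celery are the objects created in your __init__.py):

from flask import jsonify

from myproject import app, celery  # hypothetical import path to the objects from __init__.py


@app.route('/status/<task_id>')  # illustrative route
def taskstatus(task_id):
    # get_result() reads the latest stored meta for this task id without blocking,
    # so the {'progress': i} dict written by update_state() is visible while the
    # task is still running.
    progress = celery.backend.get_result(task_id)
    return jsonify({'progress': progress})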
I would like to use Django signals to trigger a celery task like so:
def delete_content(sender, instance, **kwargs):
    task_id = uuid()
    task = delete_libera_contents.apply_async(kwargs={"instance": instance}, task_id=task_id)
    task.wait(timeout=300, interval=2)
But I'm always running into kombu.exceptions.EncodeError: Object of type MusicTracks is not JSON serializable
Now I'm not sure how to treat the MusicTracks instance, as it's a model class instance. How can I properly pass such instances to my task?
At my tasks.py I have the following:
@app.task(name="Delete Libera Contents", queue='high_priority_tasks')
def delete_libera_contents(instance, **kwargs):
    libera_backend = instance.file.libera_backend
    ...
Never send a model instance to a Celery task. Only send plain values, for example the instance's primary key, then look the instance up by that pk inside the task and run your logic there.
Your code should look like this:
views.py
def delete_content(sender, instance, **kwargs):
    task_id = uuid()
    task = delete_libera_contents.apply_async(kwargs={"instance_pk": instance.pk}, task_id=task_id)
    task.wait(timeout=300, interval=2)
tasks.py
@app.task(name="Delete Libera Contents", queue='high_priority_tasks')
def delete_libera_contents(instance_pk, **kwargs):
    instance = Instance.objects.get(pk=instance_pk)
    libera_backend = instance.file.libera_backend
    ...
You can find this rule in the Celery documentation (I can't find the link right now). To see one of the reasons, imagine this situation:
you send your instance to a Celery task (and for some reason it is delayed by 5 minutes)
before the task finishes, your project runs some logic that modifies this instance
then the task's turn comes and it works on the old version of the instance, so your data ends up inconsistent
(this is the reason as I understand it, not a quote from the documentation)
First off, sorry for making the question a bit confusing, especially for the people that have already written an answer.
In my case, the delete_content signal can be triggered by three different models, so it actually looks like this:
@receiver(pre_delete, sender=MusicTracks)
@receiver(pre_delete, sender=Movies)
@receiver(pre_delete, sender=TvShowEpisodes)
def delete_content(sender, instance, **kwargs):
    delete_libera_contents.delay(instance_pk=instance.pk)
So every time one of these models triggers a delete action, this signal will also trigger a celery task to actually delete the stuff in the background (all stored on S3).
As I cannot and should not pass instances around directly, as pointed out by @oruchkin, I pass instance.pk to the Celery task, where I then have to look the instance up, since the task does not know which model triggered the delete action:
@app.task(name="Delete Libera Contents", queue='high_priority_tasks')
def delete_libera_contents(instance_pk, **kwargs):
    if Movies.objects.filter(pk=instance_pk).exists():
        instance = Movies.objects.get(pk=instance_pk)
    elif MusicTracks.objects.filter(pk=instance_pk).exists():
        instance = MusicTracks.objects.get(pk=instance_pk)
    elif TvShowEpisodes.objects.filter(pk=instance_pk).exists():
        instance = TvShowEpisodes.objects.get(pk=instance_pk)
    else:
        # logger.exception() returns None, so it cannot be raised directly
        logger.error("Task: 'Delete Libera Contents', reports: No instance found (code: JFN4LK) - Warning")
        raise ValueError("No instance found (code: JFN4LK)")
    libera_backend = instance.file.libera_backend
You might ask why I don't simply pass the sender from the signal to the Celery task. I also tried this and, as already pointed out, I cannot pass classes or instances either; it fails with:
kombu.exceptions.EncodeError: Object of type ModelBase is not JSON serializable
So it really seems I have to look up the instance by hand using the if-elif-else clauses in the Celery task.
I am new to Python and Celery-Redis, so please correct me if my understanding is incorrect.
I have been debugging a code base which has a structure like this:
TaskClass -> Celery Task
HandlerClass1, HandlerClass2 -> These are Python classes extending object
The application creates a TaskClass instance, say dumyTask, and dumyTask creates Celery subtasks (I believe these subtasks are unique), say dumySubTask1 and dumySubTask2, by taking the signatures of the handlers.
What I am not able to understand:
1) How does Celery manage the results of dumySubTask1, dumySubTask2 and dumyTask? I mean, the results of dumySubTask1 and dumySubTask2 should be aggregated and returned as the result of dumyTask. How does Celery-Redis manage this?
2) Once a task is executed, how does Celery store task results in the backend? I mean, will the results of dumySubTask1 and dumySubTask2 be stored in the backend, then returned to dumyTask, and then dumyTask returns its result to the queue (please correct me if I am wrong)?
3) Does Celery maintain tasks and subtasks as a stack? Please see the snapshot: Task-SubTask Tree
Any guidance is highly appreciated. Thanks.
A Celery worker can invoke 'tasks'. A 'task' can have 'subtasks' which can be 'chained' together, i.e. invoked sequentially. 'chain' is the term specifically used in the Celery canvas guide. The result is then returned to the queue in Redis.
Celery workers are used to invoke 'independent tasks', mostly for 'network use cases', e.g. 'sending email', 'hitting a URL'.
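To make the result flow concrete, here is a minimal sketch; the task names and Redis URLs are illustrative, not taken from your code base. A chain runs subtasks sequentially, while a chord (also covered in the canvas guide) is the primitive that aggregates the results of a group of subtasks into a single callback result stored in the backend:

from celery import Celery, chain, chord

# hypothetical app; your actual broker/backend URLs will differ
app = Celery(__name__, broker="redis://localhost:6379/0", backend="redis://localhost:6379/0")


@app.task
def handle_part(x):
    return x * 2


@app.task
def aggregate(results):
    # a chord callback receives the list of all subtask results
    return sum(results)


# chain: each subtask runs sequentially; every result is stored under its own task id
seq = chain(handle_part.s(1), handle_part.s(), handle_part.s())()

# chord: the header group runs in parallel, then the callback aggregates the results
agg = chord([handle_part.s(i) for i in range(5)])(aggregate.s())

print(seq.get(), agg.get())  # both AsyncResults are fetched from the Redis result backend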
You need to get it from the celery instance with
task = app_celery.AsyncResult(task_id)
Full example - below
My celery_worker.py file is:
import os
import time

from celery import Celery
from dotenv import load_dotenv

load_dotenv(".env")

celery = Celery(__name__)
celery.conf.broker_url = os.environ.get("CELERY_BROKER_URL")
celery.conf.result_backend = os.environ.get("CELERY_RESULT_BACKEND")


@celery.task(name="create_task")
def create_task(a, b, c):
    print(f"Executing create_task it will take {a}")
    [print(i) for i in range(100)]
    time.sleep(a)
    return b + c
I'm using FastAPI; my endpoints are:
# To execute the task
@app.get("/sum")
async def root(sleep_time: int, first_number: int, second_number: int):
    process = create_task.delay(sleep_time, first_number, second_number)
    return {"process_id": process.task_id, "result": process.result}

# To get the task status and result
from celery_worker import create_task, celery


@app.get("/task/{task_id}")
async def check_task_status(task_id: str):
    task = celery.AsyncResult(task_id)
    return {"status": task.status, "result": task.result}
My .env file has:
CELERY_BROKER_URL=redis://redis:6379/0
CELERY_RESULT_BACKEND=redis://redis:6379/0
I am using Celery to execute my asynchronous tasks, and what I'm trying to achieve is to get the name and the id of each task in the workflow after I execute it.
exec_workflow = chain(
    task1.si(),
    task2.si(),
    task3.si()
)

result = exec_workflow.apply_async()

tasks = []
for t in result._parents():
    tasks.append({"id": t.id, "name": t.name})
But it seems like AsyncResult does not have the name property for some strange reason. Any idea what would be the appropriate way to do this?
A different approach might be to force an id on each task before I call apply_async; this would solve my problem because I would be able to match ids to task names. But I'm not sure if that's possible.
Thanks.
Not the best solution, but it works:
result = signature.apply_async()
result._cache['task_name']
# 'procedures.tasks.stop'
There is a configuration option result_extended in Celery for this purpose (it is set to False by default).
Enables extended task result attributes (name, args, kwargs, worker, retries, queue, delivery_info) to be written to backend.
Ref.:
https://docs.celeryproject.org/en/master/userguide/configuration.html#result-extended
Consumer example (Worker)
from typing import Final

from celery import Celery

app: Final = Celery(
    broker="amqp://...",
    result_backend="redis://...",
    result_extended=True,
)


@app.task(name="foo-service:bar")
def _() -> int:
    return 42
Producer example (Client)
from pprint import pprint
from typing import Final

from celery import Celery
from celery.result import AsyncResult

app: Final = Celery(broker="amqp://...", result_backend="redis://...")

result: AsyncResult = app.send_task("foo-service:bar")

assert result.get() == 42
assert result.name == "foo-service:bar"
assert result.queue == ...
assert result.args == ...
assert result.kwargs == ...
assert result.worker == ...

pprint(result.__dict__)
Alright, so I've solved my problem. What I did eventually was to just set the id property of each task.
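For reference, a minimal sketch of that idea, reusing the chain from the question; the uuid-based ids are illustrative, and Signature.set() is what attaches the id as an execution option:

from uuid import uuid4

from celery import chain

# assign an explicit id to each signature so ids can be mapped to task names up front
sigs = [task1.si(), task2.si(), task3.si()]
tasks = []
for sig in sigs:
    task_id = str(uuid4())
    sig.set(task_id=task_id)                          # stored in the signature's options
    tasks.append({"id": task_id, "name": sig.task})   # sig.task is the registered task name

result = chain(*sigs).apply_async()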
I'm trying to create some celery tasks as classes, but am having some difficulty. The classes are:
class BaseCeleryTask(app.Task):

    def is_complete(self):
        """ default method for checking if celery task has completed. """
        # simply return result (since by default tasks return boolean indicating completion)
        try:
            return self.result
        except AttributeError:
            logger.error('Result not defined. Make sure task has run!')
            return False


class MacroReportTask(BaseCeleryTask):

    def run(self, params):
        """ Override the default run method with signal factory run"""
        # hold on to the factory
        process = MacroCountryReport(params)
        self.result = process.run()
        return self.result
But when I initialize the app and check app.tasks (or run the worker), the app doesn't seem to have the above tasks in its registry. Other function-based tasks (using the app.task() decorator) are registered fine.
I run the above task as:
process = SignalFactoryTask()
process.delay(params)
Celery worker errors with the following message:
Received unregistered task of type None.
I think the issue I'm having is: how do I add custom classes to the task registry as I do with regular function based tasks?
I ran into the exact same issue; it took hours to find the solution because I'm 90% sure it's a bug. In your task classes, try the following:
class BaseCeleryTask(app.Task):
    def __init__(self):
        self.name = "[modulename].BaseCeleryTask"


class MacroReportTask(app.Task):
    def __init__(self):
        self.name = "[modulename].MacroReportTask"
It seems registering it with the app still has a bug where the name isn't automatically configured. Let me know if that works.
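For what it's worth, a minimal sketch of registering the class-based task explicitly; this assumes Celery 4+, where Celery.register_task() is available, and reuses params and MacroReportTask from the question:

# instantiate and register the class-based task so it appears in app.tasks
macro_report_task = app.register_task(MacroReportTask())

# the returned task can then be called like any decorated task
macro_report_task.delay(params)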
# get function like this:
@gen.coroutine
def get(self, url=None):
    if not url:
        url = "https://www.baidu.com/"
    res = yield self.client.fetch(url)
    raise gen.Return(res.body)

# add_job method:
self.sdu.add_job(
    tornado.ioloop.IOLoop.instance().add_callback,
    'interval',
    seconds=delta_time,
    args=[get],
)
I start APScheduler in the Tornado application:
self.sdu = scheduler.SchedulerWrapper()
self.sdu.start()
and the error log is:
ValueError: This Job cannot be serialized since the reference to its callable
(<...>) could not be determined. Consider giving a textual reference
(module:function name) instead.
I don't know how to solve this problem; asking for your help.
You are trying to add a bound method as the target function of a job. That will not work with a persistent job store. Instead, create a new function as the target which will then get the global IOLoop and start the coroutine.
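A minimal sketch of that idea follows; the module name, URL and commented scheduling call are assumptions based on the question, not a drop-in fix:

# jobs.py - a module-level target that APScheduler can serialize as a textual reference
import tornado.ioloop
from tornado import gen
from tornado.httpclient import AsyncHTTPClient


@gen.coroutine
def fetch(url="https://www.baidu.com/"):
    client = AsyncHTTPClient()
    res = yield client.fetch(url)
    raise gen.Return(res.body)


def run_fetch():
    # resolve the IOLoop at call time and start the coroutine on it
    tornado.ioloop.IOLoop.current().add_callback(fetch)


# scheduling, e.g. in the application setup (sdu and delta_time come from the question):
# self.sdu.add_job('jobs:run_fetch', 'interval', seconds=delta_time)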