How to constantly notify the user about a Celery task's execution status? - python

I integrated my project with Celery in the following way. Inside views.py, after receiving the request from the user:

def upload(request):
    if request.method == "POST":
        # save the file
        task_parse.delay()
        # continue
and in tasks.py:

from __future__ import absolute_import
from celery import shared_task
from uploadapp.main import aunit

@shared_task
def task_parse():
    aunit()
    return True
In short, the shared task runs the function aunit() from a third Python file, uploadapp/main.py.
Let's assume that aunit() is a resource-heavy process that takes time (like file parsing). Since I integrated it with Celery, it now runs fully asynchronously, which is what I want: the task starts -> Celery processes it -> it finishes and Celery sets the status to finished. I can view that using Flower.
But I also want to notify the user, through the Django UI, that their task is done processing as soon as Celery has finished on the back end and set the status to SUCCESS.
Now, I know this is possible if:
1.) I constantly request the STATUS and check whether it returns SUCCESS or not.
How do I do that with Celery? How can you query a Celery task's status from views.py and notify the user asynchronously using just Celery's Python module?

You need a real-time mechanism. I would suggest Firebase. At the end of the Celery task, update the user's field in the Firebase Realtime Database with a boolean True, then implement a JavaScript listener for changes on that user_id object and update the UI when it fires.
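A minimal sketch of the task-side update, assuming the firebase_admin SDK, placeholder credentials and database URL, and that the user id is passed to the task (none of these details are in the original answer):

import firebase_admin
from celery import shared_task
from firebase_admin import credentials, db

from uploadapp.main import aunit

# One-time setup; the service-account file and database URL are placeholders.
firebase_admin.initialize_app(
    credentials.Certificate("serviceAccount.json"),
    {"databaseURL": "https://your-project.firebaseio.com"},
)

@shared_task
def task_parse(user_id):
    aunit()  # the long-running work
    # Flip the flag that the browser-side JavaScript listener is watching.
    db.reference(f"/task_status/{user_id}").set(True)
    return True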


How to determine the name of a task in celery?

I have a FastAPI app from which I want to call a Celery task.
I cannot import the task because the two live in different codebases, so I have to call it by its name.
in tasks.py:

imagery = Celery(
    "imagery", broker=os.getenv("BROKER_URL"), backend=os.getenv("REDIS_URL")
)
...

@imagery.task(bind=True, name="filter")
def filter_task(self, **kwargs) -> Dict[str, Any]:
    print('running task')
The celery worker is running with this command:
celery worker -A worker.imagery -P threads --loglevel=INFO --queues=imagery
Now, in my FastAPI codebase, I want to run the filter task.
My understanding is that I have to use the celery.send_task() function.
In app.py I have:

import os

from celery import Celery, states
from celery.execute import send_task
from fastapi import FastAPI
from starlette.responses import JSONResponse, PlainTextResponse

from app import models

app = FastAPI()
tasks = Celery(broker=os.getenv("BROKER_URL"), backend=os.getenv("REDIS_URL"))

@app.post("/filter", status_code=201)
async def upload_images(data: models.FilterProductsModel):
    """
    TODO: use a celery task(s) to query the database and upload the results to S3
    """
    data = ['ok', 'un test']
    result = tasks.send_task('workers.imagery.filter', args=list(data))
    return PlainTextResponse(f"here is the id: {str(result.ready())}")
After calling the /filter endpoint, I don't see any task being picked up by the worker.
So I tried different names in send_task():
filter
imagery.filter
worker.imagery.filter
Why does my task never get picked up by the worker, and why does nothing show up in the log?
Is my task name wrong?
Edit:
The worker process runs in Docker. Here is the full path of the file on its disk:
tasks.py : /workers/worker.py
So if I follow the import scheme, the name of the task would be workers.worker.filter, but this does not work; nothing gets printed in the Docker logs. Is a print supposed to appear in the STDOUT of the Celery CLI?
Your Celery worker is subscribed to the imagery queue only. On the other hand, with result = tasks.send_task('workers.imagery.filter', args=list(data)) you send the task to the default queue (named celery unless you changed the configuration). It is not surprising that your worker never executes the task: you have been sending tasks to the default queue the whole time.
To fix this, try the following:
result = tasks.send_task('workers.imagery.filter', args=list(data), queue='imagery')
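Alternatively, the routing can be declared once on the producer side so that every call for that task name lands on the imagery queue; a minimal sketch, assuming the same tasks app object as above:

# Route by task name instead of passing queue= on every send_task call.
tasks.conf.task_routes = {
    'workers.imagery.filter': {'queue': 'imagery'},
}

result = tasks.send_task('workers.imagery.filter', args=list(data))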
OP here.
This is the solution I used:

from celery import signature

task = signature("filter", kwargs=data.dict(), queue="imagery")
res = task.delay()

As mentioned by @DejanLekic, I had to specify the queue.

How to get flask request context in celery task?

I have a Flask server running within Gunicorn.
In my Flask application I want to handle large file uploads (>20 GB), so I plan to let a Celery task handle the large file.
The problem is that retrieving the file from request.files already takes quite long, and in the meantime Gunicorn terminates the worker handling that request. I could increase the timeout, but the maximum file size is currently unknown, so I don't know how much time I would need.
My plan was to make the request context available to the Celery task, as described here: http://xion.io/post/code/celery-include-flask-request-context.html, but I cannot make it work.
Q1: Is the signature right?
I set the signature with
celery.signature(handle_large_file, args={}, kwargs={})
and nothing complains. The arguments I pass from the Flask request handler reach the Celery task, but that's it. Should I somehow get a handle to the context here?
Q2: How do I use the context?
I would have thought that if the Flask request context were available I could just use request.files in my code, but instead I get a warning that I am working outside of the request context.
I am using Celery 4.4.0.
Code:

# in celery.py:
from flask import request
from celery import Celery

celery = Celery('celery_worker',
                backend=Config.CELERY_RESULT_BACKEND,
                broker=Config.CELERY_BROKER_URL)

@celery.task(bind=True)
def handle_large_file(task_object, data):
    # do something with the large file...
    # what I'd like to do:
    files = request.files['upfile']
    ...

celery.signature(handle_large_file, args={}, kwargs={})
# in main.py
def create_app():
    app = Flask(__name__.split('.')[0])
    ...
    celery_worker.conf.update(app.config)

    # copy from the blog
    class RequestContextTask(Task): ...

    celery_worker.Task = RequestContextTask
# in Controller.py
@FILE.route("", methods=['POST'])
def upload():
    data = dict()
    ...
    handle_large_file.delay(data)
What am I missing?
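For comparison (this is not from the original post), a signature is normally built with the concrete call arguments, for example via the task's .s() shortcut; a minimal sketch, leaving the request-context question aside:

# Conventional signature usage; `data` is whatever the view collected.
sig = handle_large_file.s(data)   # binds `data` as the task argument
result = sig.delay()              # or sig.apply_async()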

Django steps or process messages via REST

For learning purposes I want to implement the following:
I have a script that runs Selenium in the background, for example, and some log messages that help me see in the terminal what is going on.
But I want to get the same messages back through my REST request to the Angular app.
print('Started')
print('Logged in')
...
print('Processing')
...
print('Success')
In my view.py file:

class RunTask(viewsets.ViewSet):
    queryset = Task.objects.all()

    @action(detail=False, methods=['GET'], name='Run Test Script')
    def run(self, request, *args, **kwargs):
        result = task()
        if result['success']:
            return Response(data=result)
        else:
            return Response(data=result['message'])

def task():
    print('Started')
    print('Logged in')
    ...
    print('Processing')
    ...
    print('Success')
    return {
        'success': True/False,
        'message': 'my status message'
    }
Right now it only shows me the final result of the task, but I want to receive the same intermediate messages so the frontend can indicate the process status.
I can't figure out how to organize this.
How can I tell Angular about my process status?
Unfortunately, it's not that simple. The REST API lets you start the task, but since it runs in the same thread, the HTTP request blocks until the task is finished before the response is sent. Your print statements won't appear in the HTTP response but in your server output (if you look at the shell where you ran python manage.py runserver, you'll see them).
Now, if you wish to get that output in real time, you'll have to look at WebSockets. They allow you to open a "tunnel" between the browser and the server and send/receive messages in real time. The django-channels library allows you to implement them.
However, for long-running background tasks (like a Selenium scraper), I would advise looking into the Celery task queue. Basically, your Django process schedules tasks into the queue, and the tasks in the queue are then executed by one (or more!) "worker" processes. The advantage is that your Django process isn't blocked by the long task: it just adds some work to the queue and then responds.
When you add a task to the queue, Celery gives you a unique identifier for it, which you can return in the HTTP response. You can then implement another endpoint that takes a task id as a parameter and returns the state of that task (is it pending? done? failed?).
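A minimal sketch of such a status endpoint, assuming a plain Django view and an already configured Celery app (the URL and names are illustrative):

# views.py (sketch): poll the state of a previously scheduled Celery task.
from celery.result import AsyncResult
from django.http import JsonResponse

def task_status(request, task_id):
    result = AsyncResult(task_id)  # looked up in the configured result backend
    return JsonResponse({'task_id': task_id, 'state': result.state})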
For this to work, you'll have to set up a "broker": a kind of database that stores the tasks to be done and their results (typically RabbitMQ or Redis). The Celery documentation explains this well: https://docs.celeryproject.org/en/latest/getting-started/brokers/index.html
Whichever way you choose, it's not trivial and will need quite some work before you get results, but it's interesting to see how it expands the possibilities of a classical HTTP server.

Python Celery: Update django model after state change

I managed to find two similar topics discussing this issue, but unfortunately I couldn't work out the best solution from them:
Update Django Model Field Based On Celery Task Status
Update Django Model Field Based On Celery Task Status
I use Django & Celery (with Redis as the message broker) and I would like to update a Django model when the Celery task status changes (from PENDING -> SUCCESS, PENDING -> FAILURE, etc.).
My code:

import time
from celery import shared_task

@shared_task(name="run_simulation")
def run_simulation(simulation_id: str):
    t1_start = time.perf_counter()
    doSomeWork()  # we may change this to sleep, for instance
    t1_end = time.perf_counter()
    return {'process_time': t1_end - t1_start}
and the particular view from which I am calling the task:

def run_simulation(request):
    form = SimulationForm(request.POST)
    if form.is_valid():
        new_simulation = form.save()
        new_simulation.save()
        task_id = tasks.run_simulation.delay(new_simulation.id)
The question is: what is the preferred way to update the state of the Simulation Django model when the status of the task changes?
In the docs I found handlers that use the on_failure, on_success, etc. methods: http://docs.celeryproject.org/en/latest/userguide/tasks.html#handlers
I don't think there's a preferred method for something like this, since it depends on your project.
You can use a monitoring task like in the link you sent: give it the id of the task to watch and re-schedule it until the monitored task reaches a finished state.
from celery.result import AsyncResult

@app.task(bind=True)
def monitor_task(self, t_id):
    """Monitor a task."""
    res = AsyncResult(t_id, backend=self.backend, app=self.app)
    if not res.ready():
        # The monitored task is still running; check again in 10 seconds.
        raise self.retry(
            countdown=10,
            exc=Exception("Main task not done yet.")
        )
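For example, the view could start the monitor right after scheduling the simulation, passing it the id returned by delay() (a sketch based on the question's view):

result = tasks.run_simulation.delay(new_simulation.id)
tasks.monitor_task.delay(result.id)  # re-checks every 10 seconds until the simulation is done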
You can also create an event receiver, check the state of the task there, and save it to the DB:
http://docs.celeryproject.org/en/latest/userguide/monitoring.html#real-time-processing
Now, if you are only interested in the success and failure states, you can create success and failure callbacks and take care of saving the success or failure state to the DB there:
tasks.run_simulation.apply_async(
    (sim_id,),
    link=tasks.success_handler.s(),
    link_error=tasks.error_handler.s()
)
http://docs.celeryproject.org/en/latest/userguide/calling.html#linking-callbacks-errbacks
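A rough sketch of what such callbacks could look like, persisting the state on the model (the Simulation import path and the status field are assumptions, not part of the original answer):

from celery import shared_task

from simulations.models import Simulation  # assumed import path

@shared_task(name="success_handler")
def success_handler(result, simulation_id):
    # `result` is the return value of run_simulation; the link mechanism prepends it.
    Simulation.objects.filter(id=simulation_id).update(status="SUCCESS")

@shared_task(name="error_handler")
def error_handler(request, exc, traceback):
    # Errbacks may accept (request, exc, traceback); the failed task's args are on the request.
    Simulation.objects.filter(id=request.args[0]).update(status="FAILURE")

# Wiring the callbacks up when scheduling the simulation:
tasks.run_simulation.apply_async(
    (sim_id,),
    link=success_handler.s(sim_id),
    link_error=error_handler.s(),
)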

Celery have task wait for completion of same task called previously with shared argument

I am currently trying to set up Celery to handle responses from a chatbot and forward those responses to a user.
The chatbot hits the /response endpoint of my server, which triggers the following function in my server.py module:
def handle_response(user_id, message):
    """Endpoint to handle the response from the chatbot."""
    tasks.send_message_to_user.apply_async(args=[user_id, message])
    return ('OK', 200,
            {'Content-Type': 'application/json; charset=utf-8'})
In my tasks.py file, I import Celery and create the send_message_to_user function:

from celery import Celery

celery_app = Celery('tasks', broker='redis://')

@celery_app.task(name='send_message_to_user')
def send_message_to_user(user_id, message):
    """Send the message to a user."""
    # Here is the logic to send the message to a specific user
My problem is that my chatbot may send multiple messages to a user, so the send_message_to_user tasks are properly put in the queue, but then a race condition arises and the messages sometimes arrive at the user in the wrong order.
How could I make each send_message_to_user task wait for the previous task with the same name and the same "user_id" argument before executing?
I have looked at this thread, Running "unique" tasks with celery, but a lock isn't the solution for me, as I don't want to implement ugly retries for when the lock is released.
Does anyone have an idea how to solve this issue in a clean(-ish) way?
Also, this is my first post here, so I'm open to any suggestions to improve my question.
Thanks!
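One possible direction (not from the original thread, and it trades throughput for ordering): send every outgoing message to a single dedicated queue and consume that queue with a one-process worker, so the tasks run strictly in arrival order. A minimal sketch with an illustrative queue name:

# Producer side: pin the messages to one queue.
tasks.send_message_to_user.apply_async(args=[user_id, message], queue='outgoing-messages')

# Worker side: a single process consuming that queue preserves FIFO order.
#   celery -A tasks worker -Q outgoing-messages --concurrency=1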
