I have the following situation.
My client sends the following task to the worker:
# client
task = my_task.apply_async((some_params), queue='my_queue')
# task.get() # This blocks
My worker executes the task properly and returns the result.
So retrieving the result with task.get() works, but it blocks. What I'd like to have instead is a callback that is called when a result (success or failure) is available.
There is an on_success function of the Task class, but that is used in the worker. Similar question
Any ideas or solutions?
You can have callbacks with a task, but it's not the caller or client that gets notified or called back (because Celery runs out of process); the callback has to be another Celery task. If you would like to use a callback, you can use Celery's link or canvas functionality.
For a simple callback that uses the result of the initiating task you can do the following:
@app.task
def add(m, n):
    return m + n

@app.task
def callback(result):
    print(f'My result was {result}')

def client_caller():
    add.apply_async(args=(2, 2), link=callback.s())
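If you also need to be notified of failures, Celery supports an error callback via link_error. A sketch along the same lines, assuming a reasonably recent Celery (per the Celery docs the worker calls the errback directly with the request, exception and traceback; on_error is just an illustrative name):

@app.task
def on_error(request, exc, traceback):
    # Called by the worker when the initiating task fails.
    print(f'Task {request.id} raised: {exc!r}')

def client_caller_with_errback():
    add.apply_async(args=(2, 2), link=callback.s(), link_error=on_error.s())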
I created an API in Python and I want to start some long-running function, but I also want to tell the user right away that the endpoint completed successfully and that the task has started executing.
I want this so that the user does not have to wait for the function to finish.
If it were represented in pseudocode, it would probably look like this:
async def my_endpoint(context):
    func_name = context.func_name
    <something_validation_block>
    return 204  # if everything is all right
So, how can this be done in one function?
I tried something like:
async def handle(context):
    <validate_block>
    threading.Thread(
        target=long_func, args=(context,),
    ).start()
    return 204
But unfortunately it does not work :(
First, asyncio has a function named asyncio.to_thread (docs).
It provides a friendly way to combine async code with threading.
(Or you can run the task in a thread pool; docs.)
Then you can use asyncio.create_task(coro) to run an async function in the background.
It returns a Task object, which is awaitable, or you can use task.add_done_callback to handle the result.
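For example, a minimal sketch of that to_thread + create_task combination (assuming Python 3.9+; blocking_work and handle are made-up names standing in for your long function and endpoint):

import asyncio
import time

def blocking_work() -> str:
    # Stand-in for the long-running function.
    time.sleep(1)
    return "result"

async def handle() -> int:
    # to_thread runs the blocking call in a worker thread; create_task schedules
    # it in the background so the endpoint can return immediately.
    task = asyncio.create_task(asyncio.to_thread(blocking_work))
    task.add_done_callback(lambda t: print("done with:", t.result()))
    # In real code, keep a reference to the task (see the note further below).
    return 204

The equivalent using run_in_executor and a done callback: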
import asyncio
import time

def block() -> str:
    print("block function start")
    time.sleep(1)
    print("block function done")
    return "result"

async def main() -> int:
    task = asyncio.get_running_loop().run_in_executor(None, block)
    task.add_done_callback(lambda task: print("task with result:", task.result()))
    print("return 204")
    return 204

asyncio.run(main())
Output:
block function start
return 204
block function done
task with result: result
NOTE: Save a reference to tasks, to avoid a task disappearing mid-execution. The event loop only keeps weak references to tasks. A task that isn’t referenced elsewhere may get garbage collected at any time, even before it’s done.
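A common way to keep such references, following the pattern from the asyncio docs (a sketch reusing the hypothetical names from the earlier sketch):

background_tasks = set()

async def handle() -> int:
    task = asyncio.create_task(asyncio.to_thread(blocking_work))
    # Hold a strong reference so the task cannot be garbage collected mid-run,
    # and let it remove itself once it finishes.
    background_tasks.add(task)
    task.add_done_callback(background_tasks.discard)
    return 204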
There are two services running in my application (Service1, Service2).
A request is sent from the client to Service1, and Service1 calls a task as follows:
result = my_task.apply_async(kwargs=data)
And my_task calls an operation on Service2:
@shared_task()
def my_task(**kwargs):
    return Server2.do_job(kwargs)
Similarly, Service2 performs a series of celery tasks and eventually has to return the output to Service1.
Question:
How can I wait for the answer to be returned from Service2?
I used result.get() to solve the problem but got no result.
Assuming Redis is our Celery message broker: is apply_async considered a blocking I/O function? In other words, is the following correct code in a Django 3.1 view, or will it block the event loop and need sync_to_async wrapping?
async def django_view(request):
    celery_task.apply_async()
    return success_page
I've seen that with FastAPI, if the Redis DB is down, the whole application hangs waiting for a connection.
By "the whole application" I mean that other requests are stuck as well.
This means that the coroutine making the call is not suspended: it is waiting on a synchronous call and blocking all the others.
This is an example of how I solved it:
self.loop = asyncio.get_event_loop()
partial_delay = lambda: entry.celery_task.delay(
    command_type=entry.command_type, command_body=entry.command_body
)
self.loop.run_in_executor(None, partial_delay)
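On Python 3.9+ the same idea can also be written with asyncio.to_thread, which pushes the synchronous broker call onto a worker thread so other requests keep running even when Redis is slow or down. A sketch using the view from the question (celery_task and success_page are the question's placeholders):

import asyncio

async def django_view(request):
    # The publish to the broker happens in a thread, so the event loop stays free;
    # this coroutine is suspended until the task has been queued.
    await asyncio.to_thread(celery_task.apply_async)
    return success_page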
I have a tornado application which needs to run a blocking function on ProcessPoolExecutor. This blocking function employs a library which emits incremental results via blinker events. I'd like to collect these events and send them back to my tornado app as they occur.
At first, Tornado seemed ideal for this use case because it's asynchronous. I thought I could simply pass a tornado.queues.Queue object to the function to be run on the pool and then put() events onto this queue as part of my blinker event callback.
However, reading the docs of tornado.queues.Queue, I learned they are not managed across processes like multiprocessing.Queue and are not thread safe.
Is there a way to retrieve these events from the pool as they occur? Should I wrap multiprocessing.Queue so it produces Futures? That seems unlikely to work as I doubt the internals of multiprocessing are compatible with tornado.
[EDIT]
There are some good clues here: https://gist.github.com/hoffrocket/8050711
To collect anything but the return value of a task passed to a ProcessPoolExecutor, you must use a multiprocessing.Queue (or another object from the multiprocessing library). Then, since multiprocessing.Queue only exposes a synchronous interface, you must use another thread in the parent process to read from the queue. (Without reaching into implementation details: there is a file descriptor that could be used here, but we'll ignore it for now since it's undocumented and subject to change.)
Here's a quick untested example:
import concurrent.futures
import multiprocessing

from tornado import gen
from tornado.ioloop import IOLoop

queue = multiprocessing.Queue()
proc_pool = concurrent.futures.ProcessPoolExecutor()
thread_pool = concurrent.futures.ThreadPoolExecutor()

async def read_events():
    while True:
        # queue.get blocks, so run it on the thread pool and await the result.
        event = await gen.convert_yielded(thread_pool.submit(queue.get))
        print(event)

async def foo():
    IOLoop.current().spawn_callback(read_events)
    await gen.convert_yielded(proc_pool.submit(do_something_and_write_to_queue))
You can do it more simply than that. Here's a coroutine that submits four slow function calls to subprocesses and awaits them:
from concurrent.futures import ProcessPoolExecutor
from time import sleep

from tornado import gen, ioloop

pool = ProcessPoolExecutor()

def calculate_slowly(x):
    sleep(x)
    return x

async def parallel_tasks():
    # Create futures in a randomized order.
    futures = [gen.convert_yielded(pool.submit(calculate_slowly, i))
               for i in [1, 3, 2, 4]]
    wait_iterator = gen.WaitIterator(*futures)
    while not wait_iterator.done():
        try:
            result = await wait_iterator.next()
        except Exception as e:
            print("Error {} from {}".format(e, wait_iterator.current_future))
        else:
            print("Result {} received from future number {}".format(
                result, wait_iterator.current_index))

ioloop.IOLoop.current().run_sync(parallel_tasks)
It outputs:
Result 1 received from future number 0
Result 2 received from future number 2
Result 3 received from future number 1
Result 4 received from future number 3
You can see that the coroutine receives results in the order they complete, not the order they were submitted: future number 1 resolves after future number 2, because future number 1 slept longer. convert_yielded transforms the Futures returned by ProcessPoolExecutor into Tornado-compatible Futures that can be awaited in a coroutine.
Each future resolves to the value returned by calculate_slowly: in this case it's the same number that was passed into calculate_slowly, and the same number of seconds as calculate_slowly sleeps.
To include this in a RequestHandler, try something like this:
from tornado import web  # in addition to the imports and pool defined above

class MainHandler(web.RequestHandler):
    async def get(self):
        self.write("Starting....\n")
        self.flush()
        futures = [gen.convert_yielded(pool.submit(calculate_slowly, i))
                   for i in [1, 3, 2, 4]]
        wait_iterator = gen.WaitIterator(*futures)
        while not wait_iterator.done():
            result = await wait_iterator.next()
            self.write("Result {} received from future number {}\n".format(
                result, wait_iterator.current_index))
            self.flush()

if __name__ == "__main__":
    application = web.Application([
        (r"/", MainHandler),
    ])
    application.listen(8888)
    ioloop.IOLoop.instance().start()
If you curl localhost:8888, you can observe that the server responds incrementally to the client request.
How can I check that a function is executed by Celery?
def notification():
    # in_celery() returns True if called from celery_test(),
    # False if called from not_celery_test()
    if in_celery():
        # Send mail directly without creating an additional celery subtask
        ...
    else:
        # Send mail by creating a celery task
        ...

@celery.task()
def celery_test():
    notification()

def not_celery_test():
    notification()
Here is one way to do it, using celery.current_task. This is the code to be used by the task:
def notification():
    from celery import current_task
    if not current_task:
        print("directly called")
    elif current_task.request.id is None:
        print("called synchronously")
    else:
        print("dispatched")

@app.task
def notify():
    notification()
This is code you can run to exercise the above:
from core.tasks import notify, notification

print("DIRECT")
notification()
print("NOT DISPATCHED")
notify()
print("DISPATCHED")
notify.delay().get()
My task code in the first snippet was in a module named core.tasks. And I shoved the code in the last snippet in a custom Django management command. This tests 3 cases:
Calling notification directly.
Calling notification through a task executed synchronously. That is, this task is not dispatched through Celery to a worker. The code of the task executes in the same process that calls notify.
Calling notification through a task run by a worker. The code of the task executes in a different process from the process that started it.
The output was:
NOT DISPATCHED
called synchronously
DISPATCHED
DIRECT
directly called
There is no line from the task's print after DISPATCHED because that output ends up in the worker log:
[2015-12-17 07:23:57,527: WARNING/Worker-4] dispatched
Important note: I was initially using if current_task is None in the first test, but it did not work. I checked and rechecked. Somehow Celery sets current_task to an object which looks like None (if you use repr on it, you get None) but is not None. (Most likely current_task is a lazy proxy object rather than the task itself, so the proxy is never literally None.) Using if not current_task works.
Also, I've tested the code above in a Django application but I've not used it in production. There may be gotchas I don't know.