How to pass another function or instance to the celery task? - python

I have created a celery task as below
import os
import time
from celery import Celery
from dotenv import load_dotenv
load_dotenv()
celery = Celery(__name__)
celery.conf.broker_url = os.environ.get("CELERY_BROKER_URL")
celery.conf.result_backend = os.environ.get("CELERY_RESULT_BACKEND")
@celery.task(name="create_task")
def message_sender(sender_func, numbers: list, message: str):
    sender_func(numbers, message)
    return "Sent Successful"
And calling the task as below
modem_conn = Modem()
task = message_sender.apply_async(
    kwargs={
        "sender_func": modem_conn.sms,
        "numbers": ["00000000"],
        "message": "sms sent",
    }
)
But I am getting the below error:
kombu.exceptions.EncodeError: Object of type method is not JSON serializable
But if I call the task without delay or apply_async, then it works. What could be the problem here, and how can I achieve this?
All I want to do is pass a function or instance while calling the celery task.

The Celery task runs in a different process (the worker) than your app, and the two communicate via the broker. Since you don't "call" the task function directly, but only send messages with serialized data that tell the worker which function to call, you can't send objects or functions. This is similar to multiprocessing, where only serialized messages can be sent between processes.
My approach would be to make the function known to the worker and then send, e.g., a string with the name of the function and call it.
sender:
task = message_sender.apply_async(
    kwargs={
        "sender_func": "sms",
        "numbers": ["00000000"],
        "message": "sms sent",
    }
)
worker:
@celery.task(name="create_task")
def message_sender(sender_func, numbers: list, message: str):
    modem_conn = Modem()
    if sender_func == "sms":
        modem_conn.sms(numbers, message)
    return "Sent Successful"
You could also use getattr or locals(); a getattr-based variant is sketched below.
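A minimal sketch of the getattr variant (assuming, as in the question, that Modem and its sms method are available on the worker side):
@celery.task(name="create_task")
def message_sender(sender_func: str, numbers: list, message: str):
    modem_conn = Modem()
    # look up the method by name, e.g. "sms" -> modem_conn.sms
    send = getattr(modem_conn, sender_func)
    send(numbers, message)
    return "Sent Successful"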

Related

Celery task_routes and task_create_missing_queues not working

I have configured a CentOS server (server A) with:
Redis 6.2.6
Python 3.8.1
Celery 5.2.1
Flower
On another server (server B) I send tasks to redis that are executed by a worker on server A.
I want to implement multiple queues and change the name of the default queue: queues "high", "normal" and "low", with "normal" being the default queue (instead of the celery queue).
If I hardcode the queue in the task, it works (they are sent to the right queue).
But if I try to do it with "task_routes", it doesn't.
I tried to:
hardcode the route in the settings
implement a routing function
implement a routing class with the routing function
But it never works. Moreover, even though I set the "task_create_missing_queues" parameter to False, all tasks are sent to the celery queue (the default queue is still celery and not normal...).
Any ideas?
Here is the code for my celeryconfig.py file (with the routing function):
from celery import Celery
from kombu import Exchange, Queue
from celery.exceptions import Reject
import re
task_create_missing_queues = False
task_queues = (
    Queue('high', Exchange('high'), routing_key='high'),
    Queue('normal', Exchange('normal'), routing_key='normal'),
    Queue('low', Exchange('low'), routing_key='low')
)
task_default_queue = 'normal'
task_default_exchange = 'normal'
task_default_routing_key = 'normal'
def route_tasks(name, args, kwargs, options, task=None, **kw):
    if ':' not in name:
        return {'queue': 'normal'}
    namespace, _ = name.split(':')
    return {'queue': namespace}
task_routes = (route_tasks,)
And my tasks are defined with a name like this in the decorator (when I hardcode the queue for each task, this is where I add queue="queue_name"; my Celery app is called celery):
@celery.task(bind=True, name="high:long_task")
I can see in flower that the configuration is well taken into account.
My worker is started with multi and -Q high,normal,low.
Many thanks!
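For comparison, the same routing can also be expressed declaratively with glob patterns instead of a routing function (a sketch only; it assumes the "queue:task" naming convention used above and does not address the default-queue issue):
task_routes = {
    'high:*': {'queue': 'high'},
    'normal:*': {'queue': 'normal'},
    'low:*': {'queue': 'low'},
}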

How to stop the execution of a long process if something changes in the db?

I have a view that sends a message to a RabbitMQ queue.
message = {'origin': 'Bytes CSV',
           'data': {'csv_key': str(csv_entry.key),
                    'csv_fields': csv_fields,
                    'order_by': order_by,
                    'filters': filters}}
...
queue_service.send(message=message, headers={}, exchange_name=EXCHANGE_IN_NAME,
                   routing_key=MESSAGES_ROUTING_KEY.replace('#', 'bytes_counting.create'))
On my consumer, I have a long process to generate a CSV.
def create(self, data):
    csv_obj = self._get_object(key=data['csv_key'])
    if csv_obj.status == CSVRequestStatus.CANCELED:
        self.logger.info(f'CSV {csv_obj.key} was canceled by the user')
        return
    result = self.generate_result_data(filters=data['filters'], order_by=data['order_by'], csv_obj=csv_obj)
    csv_data = self._generate_csv(result=result, csv_fields=data['csv_fields'], csv_obj=csv_obj)
    file_key = self._post_csv(csv_data=csv_data, csv_obj=csv_obj)
    csv_obj.status = CSVRequestStatus.READY
    csv_obj.status_additional = CSVRequestStatusAdditional.SUCCESS
    csv_obj.file_key = file_key
    csv_obj.ready_at = timezone.now()
    csv_obj.save(update_fields=['status', 'status_additional', 'ready_at', 'file_key'])
    self.logger.info(f'CSV {csv_obj.name} created')
The long process happens inside self._generate_csv, because self.generate_result_data returns a queryset, which is lazy.
As you can see, if a user changes the status of the csv_request through an endpoint BEFORE the message starts to be consumed, the process will not be evaluated. My goal is to allow this cancellation during the execution of self._generate_csv.
So far I have tried to use threading, but unsuccessfully.
How can I achieve my goal?
Thanks a lot!
Why don't you check out the Celery library? Using Celery with Django and a RabbitMQ broker is much easier than directly leveraging RabbitMQ queues.
Celery has an inbuilt function revoke to terminate an ongoing task:
>>> from celery.task.control import revoke
>>> revoke(task_id, terminate=True)
See also: the related SO answer and the Celery docs.
For your use case, you probably want something like (code snippets):
## celery/tasks.py
from celery import app

@app.task(queue="my_queue")
def create_csv(message):
    # ...snip...
    pass
## main.py
from celery import uuid, current_app

def start_task(task_id, message):
    current_app.send_task(
        "create_csv",
        args=[message],
        task_id=task_id,
    )

def kill_task(task_id):
    current_app.control.revoke(task_id, terminate=True)
## signals.py
from django.db import models
from django.dispatch import receiver
from .models import MyModel
from .main import kill_task

# choose appropriate signal to listen for DB change
@receiver(models.signals.post_save, sender=MyModel)
def handler(sender, instance, **kwargs):
    kill_task(instance.task_id)
Use celery.uuid to generate task IDs which can be stored in the DB or cache, and use the same task ID to control the task, i.e. request termination.
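A minimal sketch of that wiring (task_id is a hypothetical field on the model, csv_request is a hypothetical model instance, and start_task is the helper from main.py above):
from celery import uuid

task_id = uuid()  # generate the ID up front so it can be stored before dispatch
csv_request.task_id = task_id
csv_request.save(update_fields=["task_id"])
start_task(task_id, message)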
Since self._generate_csv is the slowest part, the obvious solution would be to work with this function.
To do this, you can divide the creation of the CSV file into several pieces. After creating each piece, check the status and see if you can continue to create the file. At the very end, glue all the pieces into a finished file.
Here is a method for combining multiple files into one.
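A minimal sketch of that chunked approach (it assumes the csv_obj model from the question, a Django-style refresh_from_db, and hypothetical chunked/_write_chunk/_merge_chunks helpers):
def _generate_csv_in_chunks(self, result, csv_fields, csv_obj, chunk_size=1000):
    chunk_paths = []
    for i, chunk in enumerate(chunked(result, chunk_size)):  # chunked(): hypothetical helper that slices the queryset
        csv_obj.refresh_from_db(fields=['status'])  # re-read the status the user may have changed
        if csv_obj.status == CSVRequestStatus.CANCELED:
            self.logger.info(f'CSV {csv_obj.key} was canceled during generation')
            return None
        chunk_paths.append(self._write_chunk(chunk, csv_fields, index=i))  # hypothetical per-piece writer
    return self._merge_chunks(chunk_paths)  # glue all the pieces into the finished file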

Telebot + Celery + pytransitions: response to task

I want to send a message after a time delay, using Celery. After the user gets the message, it triggers a new state. For this I need a telebot.types.Message object to be sent as an argument to the Celery task. How can I do this correctly?
My transition function to start the Celery task:
def delay_message(self, event):
    celery_utils.delay_message.apply_async(kwargs={'response': self.response}, countdown=1)  # self.response is a telebot.types.Message
Celery task:
@celery.task()
def delay_message(response):
    machine = routes.DialogMachine(transitions=app.config['transitions'])
    machine.response = response
    machine.send_random_motivation_message()
In send_random_motivation_message() I need the telebot.types.Message as self.response, but I can't send this type to the Celery task.
I assume you can't send it because it is not serializable, right? If that is the case, your only option is to send as many parameters as needed as a dictionary or tuple, and create the telebot.types.Message inside the Celery task.
You could try jsonpickle to generate JSON out of the pickled telebot.types.Message object, pass that to your Celery task, and inside the task use jsonpickle to recreate the object.
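A minimal sketch of the jsonpickle route (delay_message is the task from the question; the attribute names are unchanged):
import jsonpickle

# caller side: serialize the telebot.types.Message to a JSON string
payload = jsonpickle.encode(self.response)
celery_utils.delay_message.apply_async(kwargs={'response': payload}, countdown=1)

# task side: rebuild the object before using it
@celery.task()
def delay_message(response):
    machine = routes.DialogMachine(transitions=app.config['transitions'])
    machine.response = jsonpickle.decode(response)
    machine.send_random_motivation_message()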

SimpleXMLRPCServer calling Celery tasks

I'm trying to make a simple RPC server with SimpleXMLRPCServer and Celery. Basically, the idea is that a remote client (client.py) can call tasks via xmlrpc.client to the server (server.py) which includes functions registered as Celery tasks (runnable.py).
The problem is, when an RPC function is registered via register_function, I can call it directly by its name and it will be executed properly, but without using Celery. What I would like to achieve is to call it via name.delay() within client.py, so that it is executed by Celery, but without locking the server thread. So, server.py should act like a proxy and allow multiple clients to call a complete set of functions like:
for task in flow:
    job = globals()[task]
    job.delay("some arg")
    while True:
        if job.ready():
            break
I've tried using register_instance with allow_dotted_names=True, but I ran into an error:
xmlrpc.client.Fault: <Fault 1: "<class 'TypeError'>:cannot marshal <class '_thread.RLock'> objects">
This led me to the question of whether it's even possible to do something like this.
Simplified code:
server.py
# ...runnable.py import
# ...rpc init
def register_tasks():
    for task in get_all_tasks():
        setattr(self, task, globals()[task])
        self.server.register_function(getattr(self, task), task)
runnable.py
app = Celery("tasks", backend="amqp", broker="amqp://")
#app.task()
def say_hello():
return "hello there"
#app.task()
def say_goodbye():
return "bye, bye"
def get_all_tasks():
tasks = app.tasks
runnable = []
for t in tasks:
if t.startswith("modules.runnable"):
runnable.append(t.split(".")[-1])
return runnable
Finally, client.py
s = xmlrpc.client.ServerProxy("http://127.0.0.1:8000")
print(s.say_hello())
I've come up with an idea which creates some extra wrappers for the Celery delay functions. Those are registered so that the RPC client can call rpc.the_remote_task.delay(*args). This returns the Celery job ID; the client then asks whether the job is ready via rpc.ready(job_id) and gets the result with rpc.get(job_id). As of now, there's an obvious security hole, as you can get results when you know the job ID, but still, it works fine.
Registering tasks (server.py)
def register_tasks():
    for task in get_all_tasks():
        # dynamically define a "<task>_runtime_task_delay" wrapper and attach it to self
        exec("""def """ + task + """_runtime_task_delay(*args):
    return celery_wrapper(""" + task + """, "delay", *args)
setattr(self, task + "_delay", """ + task + """_runtime_task_delay)
""")
        f_delay = task + "_delay"
        self.server.register_function(getattr(self, f_delay), task + ".delay")

    def job_ready(jid):
        return celery_wrapper(None, "ready", jid)

    def job_get(jid):
        return celery_wrapper(None, "get", jid)

    setattr(self, "ready", job_ready)
    setattr(self, "get", job_get)

    self.server.register_function(job_ready, "ready")
    self.server.register_function(job_get, "get")
The wrapper (server.py)
def celery_wrapper(task, method, *args):
    if method == "delay":
        job = task.delay(*args)
        job_id = job.id
        return job_id
    elif method == "ready":
        res = app.AsyncResult(args[0])
        return res.ready()
    elif method == "get":
        res = app.AsyncResult(args[0])
        return res.get()
    else:
        return "0"
And the RPC call (client.py)
jid = s.the_remote_task.delay("arg1", "arg2")
is_running = True
while is_running:
    is_running = not s.ready(jid)
    if not is_running:
        print(s.get(jid))
    time.sleep(.01)

Celery dynamic queue creation and routing

I'm trying to call a task and create a queue for that task if it doesn't exist, then immediately insert the called task into that queue. I have the following code:
@task
def greet(name):
    return "Hello %s!" % name

def run():
    result = greet.delay(args=['marc'], queue='greet.1',
                         routing_key='greet.1')
    print result.ready()
then I have a custom router:
class MyRouter(object):

    def route_for_task(self, task, args=None, kwargs=None):
        if task == 'tasks.greet':
            return {'queue': kwargs['queue'],
                    'exchange': 'greet',
                    'exchange_type': 'direct',
                    'routing_key': kwargs['routing_key']}
        return None
This creates an exchange called greet.1 and a queue called greet.1, but the queue is empty. The exchange should just be called greet, and it should know how to route a routing key like greet.1 to the queue called greet.1.
Any ideas?
When you do the following:
task.apply_async(queue='foo', routing_key='foobar')
then Celery will take the default values from the 'foo' queue in CELERY_QUEUES, or, if it does not exist, automatically create it using (queue=foo, exchange=foo, routing_key=foo).
So if 'foo' does not exist in CELERY_QUEUES you will end up with:
queues['foo'] = Queue('foo', exchange=Exchange('foo'), routing_key='foo')
The producer will then declare that queue, but since you override the routing_key, it will actually send the message using routing_key = 'foobar'. This may seem strange, but the behavior is actually useful for topic exchanges, where you publish to different topics.
It's harder to do what you want, though: you can create the queue yourself and declare it, but that won't work well with automatic message publish retries. It would be better if the queue argument to apply_async could accept a custom kombu.Queue instead, which would be both declared and used as the destination. Maybe you could open an issue for that at http://github.com/celery/celery/issues
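For illustration, one way to get the layout described in the question is to declare the queue up front, so the producer no longer falls back to the auto-created defaults, and then publish to it explicitly. This is only a sketch, using the old-style CELERY_QUEUES setting seen in this thread:
from kombu import Exchange, Queue

greet_exchange = Exchange('greet', type='direct')

CELERY_QUEUES = (
    # bind the greet.1 queue to the plain 'greet' exchange with routing key 'greet.1'
    Queue('greet.1', exchange=greet_exchange, routing_key='greet.1'),
)

# publish explicitly to that queue
greet.apply_async(args=['marc'], queue='greet.1', routing_key='greet.1')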
