Running Django celery on load - python

Hi, I am working on a project where I need celery beat to run long-term periodic tasks. The problem is that after starting celery beat, it waits for the full specified interval before running the task for the first time.
I want to fire the task on load the first time and then run it periodically.
I have seen this question on Stack Overflow and this issue on GitHub, but didn't find a reliable solution.
Any suggestions on this one?

Since this does not seem possible, I suggest a different approach: call the task explicitly when you need it and let the scheduler continue scheduling the tasks as usual. You can call the task on startup using one of the following methods (you probably need to take care of ready() being called multiple times if the task is not idempotent). Alternatively, call the task from the command line by using celery call after your Django server startup command.

The best place to call it will most of the time be the ready() method of your app's AppConfig class:
from django.apps import AppConfig
from myapp.tasks import my_task


class RockNRollConfig(AppConfig):
    # ...

    def ready(self):
        my_task.delay(1, 2, 3)
Notice the use of .delay(), which puts the invocation on the celery queue and doesn't slow down starting the server.
See: https://docs.djangoproject.com/en/3.2/ref/applications/#django.apps.AppConfig and https://docs.celeryproject.org/en/stable/userguide/calling.html.
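If the task is not idempotent, ready() firing during management commands (or twice under runserver's autoreloader) can be a problem. Here is a minimal sketch of a guard, assuming you only want the task enqueued when the development server actually starts; the checks below are heuristics, not a complete solution for every deployment setup:

import os
import sys

from django.apps import AppConfig


class RockNRollConfig(AppConfig):
    name = 'myapp'

    def ready(self):
        # Heuristic: only enqueue when runserver is starting, not during
        # migrate/shell/test management commands.
        if 'runserver' not in sys.argv:
            return
        # Under the autoreloader ready() runs in two processes; RUN_MAIN is
        # set to 'true' only in the child process that serves requests.
        if os.environ.get('RUN_MAIN') != 'true':
            return
        from myapp.tasks import my_task
        my_task.delay(1, 2, 3)

For production servers (gunicorn, uwsgi) you would need a different check, since sys.argv will not contain runserver there.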

Related

Enqueue celery task from other project

I have a project using celery to process tasks, and a second project which is an API that might need to enqueue tasks to be processed by celery workers.
However, these 2 projects are separated and I can't import the tasks in the API one.
I've used Sidekiq - Celery's equivalent in Ruby - in the past, and for example it is possible to push jobs by storing data in Redis from other languages/apps/processes if using the same format/payload.
Is something similar possible with Celery? I couldn't find anything related.
Yes, this is possible in celery using send_task or signatures. Assuming fetch_data is the task in the separate code base, you can invoke it using one of the methods below.
send_task
celery_app.send_task('fetch_data', kwargs={'url': request.json['url']})
app.signature
celery_app.signature('fetch_data', kwargs={'url': request.json['url']}).delay()
You just specify the function name as a string and do not need to import it into your codebase.
You can read about this in more detail from https://www.distributedpython.com/2018/06/19/call-celery-task-outside-codebase/
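As a rough, self-contained sketch of the API-side code, assuming both projects share the same broker and the worker project has a task registered under the name 'fetch_data' (the broker URL and task name here are placeholders):

from celery import Celery

# A "client" Celery app pointed at the same broker the workers use; no task
# code from the worker project needs to be importable in this codebase.
celery_app = Celery(broker='redis://localhost:6379/0')

# Enqueue by task name only; whichever worker has 'fetch_data' registered
# will pick it up.
result = celery_app.send_task('fetch_data', kwargs={'url': 'https://example.com'})
print(result.id)  # calling result.get() would additionally require a result backend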

Django 3.0: Running background infinite loop in app ready()

I'm trying to find a way to constantly poll a server every x seconds from the ready() function of Django, basically something which looks like this:
from django.apps import AppConfig


class ApiConfig(AppConfig):
    name = 'api'

    def ready(self):
        import threading
        import time
        from django.conf import settings
        from api import utils

        def refresh_ndt_servers_list():
            while True:
                utils.refresh_servers_list()
                time.sleep(settings.WAIT_SECONDS_SERVER_POLL)

        thread1 = threading.Thread(target=refresh_ndt_servers_list)
        thread1.start()
I just want my utils.refresh_servers_list() to be executed when Django starts/is ready, and then re-executed (it populates my DB) every settings.WAIT_SECONDS_SERVER_POLL seconds indefinitely. The problem is that if I run python manage.py migrate, the ready() function gets called and the command never finishes. I would like to avoid calling this function during migrations.
Thanks!
AppConfig.ready() is meant to "... perform initialization tasks ..." and make your app ready to run / serve requests. The actual working logic of the app should run after the Django app is initialized.
For launching a task at regular intervals, a cron job can be used.
Or, set up a periodic celery task with celery beat.
Also, the task in question seems to perform database updates (it is good for those to be atomic). It may be critical that only a single instance of it runs at a time; one cron job or one celery beat entry takes care of that.
However, the next run may still start while the previous one has not finished, or the task may be launched manually for some reason, so adding some locking logic to the task to ensure only one instance runs (or locking the relevant database table for the run) may be desirable.
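For the celery beat route, here is a minimal sketch of what the question's loop could become, assuming a project package named proj and an api.tasks module; the 60-second interval stands in for settings.WAIT_SECONDS_SERVER_POLL:

# proj/celery.py -- sketch of replacing the polling thread with celery beat
from celery import Celery

app = Celery('proj', broker='redis://localhost:6379/0')
app.config_from_object('django.conf:settings', namespace='CELERY')
app.autodiscover_tasks()

# Run the refresh task on a fixed interval instead of a while-True loop.
app.conf.beat_schedule = {
    'refresh-servers-list': {
        'task': 'api.tasks.refresh_servers_list',
        'schedule': 60.0,  # seconds; could come from settings.WAIT_SECONDS_SERVER_POLL
    },
}

# api/tasks.py -- one short, stateless run per invocation
from celery import shared_task
from api import utils

@shared_task
def refresh_servers_list():
    utils.refresh_servers_list()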

How to replace threads in Django with a Celery worker

I know it's not the best practice to use threads in django project but I have a project that is using threads:
threading.Thread(target=save_data, args=(dp, conv_handler)).start()
I want to replace this code with celery - to run a worker with the function
save_data(dispatcher, conversion)
Inside save_data I have an infinite loop, and in this loop I save the states of dispatcher and conversation to a file on disk with pickle.
I want to know whether I can use celery for such work.
Can the worker see changes of state in dispatcher and conversation?
I personally don't like long-running tasks in Celery. Normally you will have a maximum task time, and if your task takes too long it can time out. The best celery tasks are quick and stateless.
Notice that Celery arguments are serialized when you launch a task, and passing a python object as a task argument is tricky (not recommended).
I would need more info about the problem you are trying to solve, but if dispatcher & conversion are django objects I would do something like:
def save_data(dispatcher_id, conversion_id):
    dispatcher = Dispatcher.objects.get(id=dispatcher_id)
    conversion = Conversion.objects.get(id=conversion_id)
And you should avoid that infinite loop in a celery task. You can work around the infinite loop by calling save_data periodically, but I encourage you to find a solution that fits Celery better (aim for quick, stateless tasks).
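A sketch of what that could look like end to end; the Dispatcher and Conversion models and the myapp module are assumptions based on the question:

from celery import shared_task
from myapp.models import Dispatcher, Conversion  # hypothetical models


@shared_task
def save_data(dispatcher_id, conversion_id):
    # Re-fetch fresh state inside the worker instead of pickling live objects.
    dispatcher = Dispatcher.objects.get(id=dispatcher_id)
    conversion = Conversion.objects.get(id=conversion_id)
    # ... persist whatever state is needed, then return instead of looping ...


# Caller side: enqueue with primary keys, never with the objects themselves, e.g.
# save_data.delay(dispatcher.id, conversion.id)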

Run task on a Model's datetime in Django

I'm working on a project and can't solve a probably simple issue.
I have a datetime on a Model, and I need to run some code when the current time reaches the Model's datetime. In other words it is a scheduler fed from the Model, and there is also a need to add recurrences like every day, every year, and so on.
I wonder if there is a simple, nice solution.
Thanks in advance.
I think you can have two solutions.
The simplest one is to create management commands that do what you need, using django.utils.timezone.now() as the starting value to filter your datetimes in the models. You can create as many commands as you wish, like
run_hourly
run_daily
run_weekly
Then you can set up cron on Linux to run the management commands when you need them, as sketched below.
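Here is a minimal sketch of such a management command; the Reminder model and its due_at/processed fields are placeholders for whatever your model looks like:

# myapp/management/commands/run_hourly.py
from django.core.management.base import BaseCommand
from django.utils import timezone

from myapp.models import Reminder  # hypothetical model


class Command(BaseCommand):
    help = "Process all reminders whose datetime has passed."

    def handle(self, *args, **options):
        now = timezone.now()
        due = Reminder.objects.filter(due_at__lte=now, processed=False)
        for reminder in due:
            # ... run whatever code belongs to this reminder ...
            reminder.processed = True
            reminder.save(update_fields=["processed"])

A crontab line such as 0 * * * * /path/to/venv/bin/python /path/to/manage.py run_hourly would then run it every hour.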
Another solution is to use a task queues tool like Celery or RQ.
Celery needs to be configured, and you must also set up your server to run Celery and the celery beat scheduler to run the tasks at specific times. If you don't have any specific requirements and you just need to run a couple of tasks, I would use cron instead of any task queue.
More about Task Queues software here: https://www.fullstackpython.com/task-queues.html
I've found an extremely useful article. Everything works fine, but PyCharm reports failures importing these:
from celery.task.schedules import crontab
from celery.decorators import periodic_task
But it still doesn't give an error while the server is running.
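Those import paths are from older Celery releases and have since been removed, which is likely why PyCharm flags them. A sketch of the newer equivalent, where crontab lives in celery.schedules and periodic tasks are registered through the beat schedule instead of the periodic_task decorator (proj.celery and the task path are placeholders):

from celery.schedules import crontab

from proj.celery import app  # your project's Celery instance

app.conf.beat_schedule = {
    'nightly-job': {
        'task': 'myapp.tasks.my_task',  # placeholder task path
        'schedule': crontab(hour=0, minute=0),  # run daily at midnight
    },
}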

How to track revoked tasks across multiple celeryd processes

I have a reminder-type app that schedules tasks in celery using the "eta" argument. If the parameters of the reminder object change (e.g. the time of the reminder), I revoke the task previously sent and queue a new one.
I was wondering if there's any good way of keeping track of revoked tasks across celeryd restarts. I'd like to be able to scale celeryd processes up/down on the fly, and it seems that any celeryd process started after the revoke command was sent will still execute that task.
One way of doing it is to keep a list of revoked task ids, but this list would grow arbitrarily. Pruning it requires guarantees that the task is no longer in the RabbitMQ queue, which doesn't seem to be possible.
I've also tried using a shared --statedb file for each of the celeryd workers, but it seems that the statedb file is only updated on termination of the workers, so it is not suitable for what I would like to accomplish.
Thanks in advance!
Interesting problem, I think it should be easy to solve using broadcast commands.
When a new worker starts up, it can ask all the other workers to dump their revoked tasks to it. That means adding two new remote control commands;
you can easily add new commands by using @Panel.register.
Module control.py:
from celery.worker import state
from celery.worker.control import Panel


@Panel.register
def bulk_revoke(panel, ids):
    state.revoked.update(ids)


@Panel.register
def broadcast_revokes(panel, destination):
    panel.app.control.broadcast("bulk_revoke",
                                arguments={"ids": list(state.revoked)},
                                destination=destination)
Add it to CELERY_IMPORTS:
CELERY_IMPORTS = ("control", )
The only remaining piece is to connect it so that the new worker triggers broadcast_revokes at startup. I guess you could use the worker_ready signal for this:
from celery import current_app as celery
from celery.signals import worker_ready


def request_revokes_at_startup(sender=None, **kwargs):
    celery.control.broadcast("broadcast_revokes",
                             destination=sender.hostname)


worker_ready.connect(request_revokes_at_startup)
I had to do something similar in my project and used celerycam with django-admin-monitor. The monitor takes a snapshot of the tasks and saves them in the database periodically, and there is a nice user interface to browse and check the status of all tasks. You can use it even if your project is not Django based.
I implemented something similar to this some time ago, and the solution I came up with was very similar to yours.
The way I solved this problem was to have the worker fetch the Task object from the database when the job ran (by passing it the primary key, as the documentation recommends). In your case, before the reminder is sent the worker should perform a check to ensure that the task is "ready" to be run. If not, it should simply return without doing any work (assuming that the ETA has changed and another worker will pick up the new job).
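A sketch of that check; the Reminder model, its send_at field, and the way the scheduled time is passed along are assumptions for illustration:

from celery import shared_task
from myapp.models import Reminder  # hypothetical model


@shared_task
def send_reminder(reminder_id, scheduled_for):
    reminder = Reminder.objects.get(pk=reminder_id)
    # If the reminder was rescheduled after this task was queued, the stored
    # time no longer matches the value this task was enqueued with: return
    # and let the newly queued task handle it.
    if reminder.send_at.isoformat() != scheduled_for:
        return
    # ... actually send the reminder ...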
