How do I schedule a job in Django? - python

I need to schedule a job using the schedule module in my Django web application.
import os
import schedule
from .models import schedulesdb  # the model that stores the schedules

def new_job(request):
    print("I'm working...")
    # latest record for this user whose file name contains "mp4"
    file = schedulesdb.objects.filter(user=request.user, f_name__icontains="mp4").last()
    file_name = str(file.f_name)
    os.startfile(file_name)
I need to run it at the time stored in the database, something like this pseudocode:
given_datetime = schedulesdb.objects.datetimes('request_time', 'second').last()
schedule.given_datetime.do(job)  # pseudocode: how do I schedule the job at this time?

Django is a web framework. It receives a request, does whatever processing is necessary, and sends out a response. It doesn't have any persistent process that could keep track of time and run scheduled tasks, so there is no good way to do this using Django alone.
That said, Celery (http://www.celeryproject.org/) is a Python framework built specifically to run tasks, both scheduled and on demand. It also integrates with the Django ORM with minimal configuration. I suggest you look into it.
You could, of course, write your own external script that uses the schedule module you mentioned. You would need a way to write schedule entries into the database, and your script could then read and execute them. Is your schedulesdb model already implemented?
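For illustration, here is a minimal sketch of such an external script, assuming the schedulesdb model from the question and a hypothetical project layout (myproject.settings, myapp); django.setup() lets the ORM be used outside the web process. Note that the schedule module only supports recurring times, so the stored datetime is reduced to a daily HH:MM slot here:
# run_scheduler.py - standalone scheduler, run separately from the Django web process
import os
import time
import django

os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'myproject.settings')  # hypothetical settings path
django.setup()

import schedule
from myapp.models import schedulesdb  # hypothetical app name

def job():
    # open the latest mp4 file, as in the question
    entry = schedulesdb.objects.filter(f_name__icontains='mp4').last()
    if entry:
        os.startfile(str(entry.f_name))  # Windows-only; use subprocess.Popen elsewhere

# read the requested time from the database and register the job
requested = schedulesdb.objects.datetimes('request_time', 'second').last()
if requested:
    schedule.every().day.at(requested.strftime('%H:%M')).do(job)

while True:
    schedule.run_pending()
    time.sleep(1)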

Related

How to implement an API which will run a python script with the data from the POST request

I want to run a python script which basically monitors any changes happening to a particular directory (the directory to monitor is passed as part of the POST request). Every time the API is called (I'm using FastAPI), a new instance of the script has to be started to monitor that particular directory and send back a "success" message as a response if the script was started successfully. Further, I am planning to add another API endpoint that will stop the script that is watching a given directory.
Can task queues like RQ or Celery be used to achieve this? Please note that I want a new script to be started every time the API is called, so multiple instances of the script should run at the same time. I am using the watchdog module to monitor the file system.
I don't know whether this is the correct way, but this is what I have come up with so far, where a new thread is created for each API call:
from fastapi import FastAPI
from threading import Thread

from schemas import Data  # pydantic schema model for the API
import filewatcher  # the script that has to be run

app = FastAPI()

@app.post('/register/event')
def register_watchdog(data: Data):
    # start a new watchdog instance in its own thread for this request
    th = Thread(target=filewatcher.create_watchdog, args=(data,))
    th.start()
    return {"status": "success"}
What is the best way to achieve this? One further question is, can I implement my script as a Linux service that can run in the background?
In fact, it is rather trivial to call the function directly if you just need to return the result of the directory-browsing function:
...
def register_watchdog(data: Data):
    return {"result": filewatcher.create_watchdog(data)}
But if you want to run some time-consuming process in the background, you really should use AMQP with workers. RabbitMQ with Celery is the right choice for this; it will make it easy to scale your system.
What is the best way to achieve this? One further question is, can I implement my script as a Linux service that can run in the background?
Yes, you can indeed run RabbitMQ together with Celery as a Linux service in the background, e.g. using supervisor (example), but this is not best practice. Look instead at containerizing the elements of your system. You can wrap Celery in a Docker container, which makes it easy to run alongside the AMQP service and your web application (example).
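As a rough sketch of the Celery route, assuming a Redis broker at its default URL and the same hypothetical filewatcher module, the endpoint would enqueue a task instead of spawning a thread:
# tasks.py - Celery sketch for the watchdog endpoint (broker URL and module names are assumptions)
from celery import Celery

import filewatcher  # hypothetical module from the question

celery_app = Celery('tasks', broker='redis://localhost:6379/0')

@celery_app.task
def watch_directory(path):
    # one watchdog instance per queued task
    filewatcher.create_watchdog(path)

# in the FastAPI endpoint, enqueue instead of starting a thread:
# watch_directory.delay(data.path)  # assumes Data has a 'path' field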

Django: How to ignore tasks with Celery?

Without changing the code itself, is there a way to ignore tasks in Celery?
For example, when using Django mails, there is a dummy backend setting. This is perfect since it allows me, from a .env file, to deactivate mail sending in some environments (like testing or staging). The code itself that handles mail sending is not changed with if statements or decorators.
For Celery tasks, I know I could do it in code using mocks or decorators, but I'd like to do it in a clean, 12-factor-compliant way, as with Django mails. Any idea?
EDIT to explain why I want to do this:
One of the main motivations behind this is that it creates coupling between the Django web server and Celery tasks.
For example, when running unit tests, if the broker server (Redis for me) is not running, then any call to the delay() method freezes forever, because Celery has no timeout when sending a task to Redis.
From an architectural point of view, this is very bad. I'd like my unit tests to run properly without requiring a Celery broker!
Thanks!
As far as the coupling is concerned, your Django application would still be tied to Celery if you used a dummy backend; your tasks just wouldn't execute. Maybe this is acceptable in your case, but in my opinion it can cause some problems. For example, if the piece of code you are trying to test submits a task to Celery and, in a later part, tries to retrieve the result of that task, it will fail, because the dummy backend would never execute the task.
For unit testing, as you mentioned in your question, you can use the task_always_eager setting. If you turn it on, your Django app will no longer depend on a running worker: it will execute tasks synchronously in the same thread and return the result.
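A minimal, 12-factor-style sketch of driving that setting from the environment in settings.py (the environment variable name here is my own convention, not Celery's):
# settings.py - toggle Celery's eager mode from a .env-style environment variable
import os

# when true, tasks execute synchronously in-process and no broker is needed
CELERY_TASK_ALWAYS_EAGER = os.environ.get('CELERY_TASK_ALWAYS_EAGER', 'false').lower() == 'true'

# propagate exceptions from eagerly executed tasks, which is useful in tests
CELERY_TASK_EAGER_PROPAGATES = CELERY_TASK_ALWAYS_EAGER
This works when the Celery app is configured from Django settings, e.g. with app.config_from_object('django.conf:settings', namespace='CELERY').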

The best practice to run periodic tasks in Python?

In my system a user is allowed to set a notification schedule: he can choose any date and time at which he wants to receive messages. I have discovered a Python mechanism named Celery that executes tasks asynchronously. This leaves me with a few questions:
How do I integrate Celery with a user interface?
Are there any Celery alternatives?
Is it a panacea?
What you are looking for is something that processes background tasks submitted to a queue from your web server. To that end, Celery is a good option and easy to configure. A more comprehensive list can be found here. None of these options integrates with a user interface directly; they integrate with your web server. They can queue jobs based on what is sent from the client side, as part of handling the request-response flow.
Also, this article provides a good reference for how to schedule periodic tasks using Celery.
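For reference, a minimal sketch of a periodic task with Celery beat, assuming a Redis broker and a hypothetical notification check (in practice the task would scan the database for due notifications):
# celery_app.py - periodic task sketch (broker URL and task body are assumptions)
from celery import Celery
from celery.schedules import crontab

app = Celery('notifications', broker='redis://localhost:6379/0')

@app.task
def send_due_notifications():
    # hypothetical: query the DB for notifications due now and send them
    print("checking for due notifications...")

app.conf.beat_schedule = {
    'check-every-minute': {
        'task': 'celery_app.send_due_notifications',
        'schedule': crontab(),  # every minute
    },
}
Run the worker with the embedded scheduler using: celery -A celery_app worker --beat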

Celery: why do I need a broker for periodic tasks?

I have a standalone script that scrapes a page, opens a connection to a database, and writes to it. I need it to execute periodically, every x hours. I could do that with a bash script like the following:
while true
do
    python scraper.py
    sleep $((60 * 60 * x))
done
From what I have read about message brokers, they are used for sending "signals" from one running program to another, like HTTP in principle. For example, a piece of code accepts an email id from the user and sends a signal with that email id to another piece of code that will send the email.
I need Celery to run a periodic task on Heroku. I already have MongoDB on a separate server. Why do I need to run another server for RabbitMQ or Redis just for this? Can I use Celery without a broker?
Celery's architecture is designed to scale and distribute tasks across several servers. For a site like yours it might be overkill. A queue service is generally needed to maintain the task list and signal the status of finished tasks.
You might want to take a look at Huey instead. Huey is a small-scale Celery clone that needs only Redis as an external dependency, not RabbitMQ. It still uses Redis's queue mechanism to line up tasks.
There is also Advanced Python Scheduler, which does not even need Redis and can hold the state of the queue in memory, in-process.
Alternatively, if you have a very small number of periodic tasks and no delayed tasks, I would just use cron and plain Python scripts to run them.
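As a sketch of the in-process option, here is roughly what the scraper loop looks like with Advanced Python Scheduler; the 3-hour interval and the scraper module are assumptions standing in for x and scraper.py:
# run_scraper.py - in-process periodic scheduling with APScheduler, no broker needed
from apscheduler.schedulers.blocking import BlockingScheduler

sched = BlockingScheduler()

@sched.scheduled_job('interval', hours=3)  # stands in for "every x hours"
def scrape():
    import scraper  # hypothetical: the standalone script from the question
    scraper.main()

sched.start()  # blocks the process and runs jobs on schedule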
As the Celery documentation explains:
Celery communicates via messages, usually using a broker to mediate between clients and workers. To initiate a task, a client adds a message to the queue, which the broker then delivers to a worker.
You can use your existing MongoDB database as the broker; see Using MongoDB in the Celery documentation.
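A minimal sketch of that configuration, assuming a Celery version that still ships the MongoDB transport (it has been deprecated in newer releases) and a hypothetical host and database name:
# app.py - MongoDB as the Celery broker (host and database name are assumptions)
from celery import Celery

app = Celery('scraper', broker='mongodb://localhost:27017/celery_broker')

@app.task
def scrape_and_store():
    # hypothetical: scrape the page and write the results to MongoDB
    pass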
For an application like this, it's better to use Django Background Tasks.
Installation
Install from PyPI:
pip install django-background-tasks
Add to INSTALLED_APPS:
INSTALLED_APPS = (
    # ...
    'background_task',
    # ...
)
Migrate your database:
python manage.py makemigrations background_task
python manage.py migrate
Creating and registering tasks
To register a task use the background decorator:
from background_task import background
from django.contrib.auth.models import User
@background(schedule=60)
def notify_user(user_id):
    # look the user up by id and send them a message
    user = User.objects.get(pk=user_id)
    user.email_user('Here is a notification', 'You have been notified')
This will convert notify_user into a background task function. When you call it from regular code, it will actually create a Task object and store it in the database. The database then contains serialised information about which function needs to run later on. This places limits on the parameters that can be passed when calling the function: they must all be serializable as JSON. That is why, in the example above, a user_id is passed rather than a User object.
Calling notify_user as normal will schedule the original function to be run 60 seconds from now:
notify_user(user.id)
This is the default schedule time (as set in the decorator), but it can be overridden:
notify_user(user.id, schedule=90) # 90 seconds from now
notify_user(user.id, schedule=timedelta(minutes=20)) # 20 minutes from now
notify_user(user.id, schedule=timezone.now()) # at a specific time
You can also run the original function right now, in synchronous mode:
notify_user.now(user.id) # launch a notify_user function and wait for it
notify_user = notify_user.now # revert task function back to normal function.
Useful for testing.
You can specify a verbose name and a creator when scheduling a task:
notify_user(user.id, verbose_name="Notify user", creator=user)
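Note that django-background-tasks only stores tasks in the database; a separate process must be running to execute them:
python manage.py process_tasks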

How to handle a request that takes a long time to run?

I'm using Django to develop a classifier service. A user can build a model using an API like http://localhost/api/buildmodel, but building a model takes a long time, maybe 2 hours, and I'm using a web page to show the result of the build. How do I design my Django program so that it returns immediately and then shows the result after the build finishes? Maybe I could use Ajax, but I want to implement it in Python, e.g. using an async method and calling a callback function after the build. Any suggestions will be appreciated.
You need to use a task queue manager. Celery is by far the most popular task manager for Django. The idea is that you hand a task to this manager; it processes the task and, once it is done, it can fire a callback function. Within the callback function, you can run your logic to notify the user that the task has been completed.
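A minimal sketch of that pattern with Celery, where the broker URL and the build/notify task bodies are assumptions; the link argument attaches the callback, which receives the finished task's return value:
# tasks.py - fire a callback when a long-running task finishes
from celery import Celery

app = Celery('classifier', broker='redis://localhost:6379/0')

@app.task
def build_model(model_id):
    # hypothetical: long-running training logic goes here
    return model_id

@app.task
def notify_done(model_id):
    # hypothetical: mark the model as ready / notify the user
    print(f"model {model_id} finished building")

# in the view, return immediately after queuing:
# build_model.apply_async(args=(42,), link=notify_done.s())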
One way to do it is to create a row in a persistent database (or a Redis key/value pair) for the task that records whether it is running or finished. Have the code set the value to running when the task starts and to done when it completes. Then have an AJAX call do a GET lookup on a URL that returns the status of the task via a web service. You can put that in a setInterval() to poll periodically and see whether it is done. You could send an email on completion, or just have a landing page / dashboard that shows the status of the tasks being run.
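A rough sketch of the status-polling endpoint in Django, where the BuildJob model and its status field are assumptions for illustration:
# views.py - status endpoint for the client to poll
from django.http import JsonResponse
from .models import BuildJob  # hypothetical model with a 'status' field

def build_status(request, job_id):
    job = BuildJob.objects.get(pk=job_id)
    return JsonResponse({"status": job.status})  # e.g. "running" or "done"
The client polls this URL, e.g. from setInterval() in JavaScript, until the status flips to "done", then loads the result page.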
