Job is not performed by APScheduler's BackgroundScheduler - python

I have a Django application that I'm running in Docker. I'm trying to launch an APScheduler scheduler when I run the Docker container.
I created a scheduler and simply added a job to it, called test1, which sends an email to my address.
This is the Python script that is launched when I run the container:
from apscheduler.schedulers.blocking import BlockingScheduler
from apscheduler.schedulers.background import BackgroundScheduler

#scheduler = BlockingScheduler()
scheduler = BackgroundScheduler()

def test1():
    ...  # (code to send email)

scheduler.add_job(test1, 'interval', seconds=20)
scheduler.start()
These are the results I obtained with each of the two kinds of scheduler:
BlockingScheduler: the scheduler works, and I receive an email every 20 seconds. However, I can't access the app. I presume this is normal, due to the very nature of the BlockingScheduler.
BackgroundScheduler: no problem accessing the application. However, I receive no email.
Since the emails were sent in one of the two cases, I guess the problem is neither Django- nor Docker-related, but purely about APScheduler. I did my research but couldn't find out why the BackgroundScheduler didn't work: in the tutorials I read, the developer set up the scheduler the same way I did.
Any help would be much appreciated, thanks!
UPDATE 1
I tried the two following things; both made the BackgroundScheduler behave like a BlockingScheduler (which is not what I want).
1) Setting the daemon option to False when initialising the scheduler instance:
scheduler = BackgroundScheduler(daemon = False)
2) "Trying to keep the main thread alive", as explained in these:
how-do-i-schedule-an-interval-job-with-apscheduler
apscheduler-inside-a-class-object
I added this right after scheduler.start():
while True:
    time.sleep(1)
scheduler.shutdown()
UPDATE 2
When I try to set up a BackgroundScheduler in a single Python file (outside of any application context), it works very well:
from apscheduler.schedulers.background import BackgroundScheduler
from apscheduler.schedulers.blocking import BlockingScheduler
def test1():
    print('issou')
scheduler = BackgroundScheduler()
scheduler.start()
scheduler.add_job(test1, 'interval', seconds=5)
print('yatangaki')
'yatangaki' is first printed, and then 'issou' every 5 seconds, so everything seems fine.
UPDATE 3
Now I've tried to start the scheduler on a Django app that I ran locally with python manage.py runserver, without using Docker.
It works perfectly: the emails are sent and I can access the main view of the application.
Note: the BackgroundScheduler is started by a function called start_test1. In this app, I call start_test1 in the top-level urls.py file. In the other app - the one I run with Docker, and the one I want to use in the end - start_test1 is called from a Python script that is itself triggered by a .sh file, which I run via the CMD Docker command.

It appears it was all about where to start the scheduler and add the job.
In what I did initially (putting the code in a .sh file), the BackgroundScheduler started, but the Python script ended immediately after being run, since the scheduler doesn't block and the .sh file wasn't really part of the app (it's used by the Dockerfile, not by the app).
I ended up finding the solution here:
execute-code-when-django-starts-once-only
There was no apps.py file inside my application, so I created one and followed the instructions in that thread.
It works fine now.
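For anyone looking for the shape of that solution: below is a minimal sketch of the apps.py approach, assuming the app is called my_app and that the scheduler-starting function is the start_test1 mentioned above (all names are illustrative; adjust them to your project).
# my_app/apps.py
from django.apps import AppConfig

class MyAppConfig(AppConfig):
    name = 'my_app'

    def ready(self):
        # Imported here so Django is fully initialised before the scheduler starts.
        from .jobs import start_test1  # hypothetical module holding start_test1
        start_test1()
Django calls ready() once the app registry is fully loaded, so the scheduler starts inside the long-lived application process rather than in a script that exits immediately.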

I had a similar problem, but it only got solved by not blocking the worker:
# working:
def worker():
    phone_elm = ....
    thread = threading.Thread(target=work, args=(phone_elm,))
    thread.start()

# not working:
def worker():
    phone_elm = ....
    work(phone_elm)
scheduler2 = BackgroundScheduler(timezone="Asia/Kolkata")
# schedule periodic scanning of the folder's running file
scheduler2.add_job(worker, 'interval', seconds=15, max_instances=5000)
scheduler2.start()
By "not working" I mean that after it was triggered ~10 times, it stopped for no reason, and started again only after one of the 10 jobs stopped (work exited).
It is also part of Django, but it is clear that start() is not exiting...


Apscheduler calling function too quickly

Here is my scheduler.py file:
from apscheduler.schedulers.background import BackgroundScheduler
from django_apscheduler.jobstores import DjangoJobStore, register_events
from django.utils import timezone
from django_apscheduler.models import DjangoJobExecution
import sys

# This is the function you want to schedule - add as many as you want and then register them in the start() function below
def hello():
    print("Hello")

def start():
    scheduler = BackgroundScheduler()
    scheduler.add_jobstore(DjangoJobStore(), "default")
    # run this job every 10 seconds
    scheduler.add_job(hello, 'interval', seconds=10, jobstore='default')
    register_events(scheduler)
    scheduler.start()
    print("Scheduler started...", file=sys.stdout)
My Django app runs fine on localhost. I'm simply attempting to print 'Hello' every 10 seconds in the terminal, but it sometimes prints 3 or 4 at a time. Why is this? This was just a base template to help understand APScheduler.
The primary reason this might be happening is that you are running your development server without the --noreload flag set, which causes the scheduler to be started twice (sometimes more).
When you run your server in development, try it like:
python manage.py runserver localhost:8000 --noreload
And see what happens. If it still keeps happening, it may be that the interval is too short, so that by the time your system gets around to the job, another instance is still running (even though it is a very short function). The Django job store keeps all pending, overdue, and completed jobs in the database, so it has to write a record for the job after each run. Try increasing the interval and see what happens.
If neither of those things works, please post the rest of the code you are using and I will update my answer with other options. I've had a similar issue in the past, but it was solved by setting the --noreload option. When you run it in production behind a regular web server with DEBUG=False, it should resolve itself as well.
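If you need to keep the autoreloader during development, another common workaround (my addition, not part of the original answer) is to guard the scheduler startup with the RUN_MAIN environment variable, which Django's autoreloader sets only in the child process that actually serves requests:
import os

def start_scheduler_once():
    # The autoreloader spawns a watcher process and a child that serves
    # requests; only the child has RUN_MAIN set to 'true'.
    if os.environ.get('RUN_MAIN') == 'true':
        start()  # the start() function from scheduler.py above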

Is there any way to reload app.py automatically?

I made and published a Flask app on PythonAnywhere. It's a single-file app whose data is updated every 5 minutes, so I need my dashboard to refresh every 5 minutes. After a reload it shows the new data, but it isn't practical to reload manually every 5 minutes. Is there any way to reload app.py using a script or by changing some settings?
PythonAnywhere has an API that allows you to reload a website's code -- the help page I linked to explains all of the details of the API, and has instructions on how to set it up. Also, the example code that is provided on the API token setup page is exactly the Python code you would need to reload a website.
So you could write a script to do that, and then use the "Scheduled tasks" feature to run it every five minutes (you'd need to use twelve hourly tasks, offset by five minutes each, for that).
However, I'd recommend against writing your code so that the site needs to be reloaded in order to update. It would be better to make it serve up data directly from a database, and then to schedule tasks that run outside the context of the website to update that database every five minutes.
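A minimal sketch of that database-backed design, assuming a SQLite file and a metrics table that a separately scheduled script inserts a row into every five minutes (all names and paths here are illustrative):
# app.py - the dashboard reads fresh data from the database on every request
import sqlite3
from flask import Flask, jsonify

app = Flask(__name__)

@app.route('/dashboard-data')
def dashboard_data():
    conn = sqlite3.connect('/home/yourusername/dashboard.db')  # hypothetical path
    row = conn.execute(
        'SELECT payload, updated_at FROM metrics ORDER BY updated_at DESC LIMIT 1'
    ).fetchone()
    conn.close()
    return jsonify(payload=row[0], updated_at=row[1])
With this layout the scheduled task only has to write to the metrics table, and the website itself never needs a reload.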
Yes, you can do that.
Simply start the Flask app in a new process, then prepare a function that will be executed after 5 minutes to restart the script:
# coding: utf-8
import sys
import subprocess
from threading import Timer
from multiprocessing import Process

from flask import Flask

app = Flask(__name__)

def run():
    # app.run() blocks, so we must run it in a new process
    app.run()

def run_after(p):
    print("## Restarting ....")
    # terminate the Flask process
    p.terminate()
    # re-run the script, e.g. (python test.py blah blah)
    args = [sys.executable] + [sys.argv[0]]
    subprocess.call(args)

if __name__ == "__main__":
    # run the Flask app in a new process
    p = Process(target=run, args=())
    p.start()
    # set a threading Timer that will execute the run_after function after 5 min
    _timer = Timer(5 * 60, run_after, (p,))
    _timer.start()

How to schedule a single-time event in Django in Heroku?

I have a question about the software design needed to schedule an event that will be triggered once at some point in the future, in Heroku's distributed environment.
I believe it's best to describe what I want to achieve; I have certainly done my research and could not figure it out myself even after two hours of work.
Let's say in my views.py I have a function:
def after_6_hours():
print('6 hours passed.')
def create_game():
print('Game created')
# of course time will be error, but that's just an example
scheduler.do(after_6_hours, time=now + 6)
so what I want to achieve is to be able to run the after_6_hours function exactly 6 hours after create_game has been invoked. Now, as you can see, this function is defined outside the usual clock.py or task.py files.
Now, how can I have my whole application running in Heroku all the time, and be able to add this job to the queue of this imaginary-for-now scheduler library?
On a side note, I can't use the Temporizer add-on of Heroku. The combination of APScheduler and Python rq looked promising, but the examples are trivial, all scheduled in the same clock.py file, and I simply don't know how to tie everything together with the setup I have. Thanks in advance!
In Heroku you can have your Django application running in a Web Dyno, which will be responsible for serving your application and also for scheduling the tasks.
For example (Please note that I did not test run the code):
Create after_hours.py, which will contain the function you are going to schedule (note that we are going to use the same source code in the worker too):
def after_6_hours():
    print('6 hours passed.')
In your views.py, use rq (note that rq alone is not enough in your situation, as you have to schedule the task) together with rq-scheduler:
from redis import Redis
from rq_scheduler import Scheduler
from datetime import timedelta

from after_hours import after_6_hours

def create_game():
    print('Game created')
    scheduler = Scheduler(connection=Redis())  # get a scheduler for the "default" queue
    scheduler.enqueue_in(timedelta(hours=6), after_6_hours)  # schedules the job to run 6 hours later
Calling create_game() should schedule after_6_hours() to run 6 hours later.
Hint: You can provision Redis in Heroku using Redis To Go add-on.
The next step is to run the rqscheduler tool, which polls Redis every minute to see if there is any job to be executed at that time and, if so, places it in the queue (to which rq workers will be listening).
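For reference, starting that polling process is a single command (these are the standard rqscheduler flags; point them at your Redis To Go instance instead of localhost):
rqscheduler --host localhost --port 6379 --db 0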
Now, in a Worker Dyno, create a file after_hours.py:
def after_6_hours():
    print('6 hours passed.')
    # better to return something
And create another file worker.py:
import os
import redis
from rq import Worker, Queue, Connection

from after_hours import after_6_hours

listen = ['high', 'default', 'low']  # while scheduling the task in views.py we sent it to 'default'

redis_url = os.getenv('REDISTOGO_URL', 'redis://localhost:6379')
conn = redis.from_url(redis_url)

if __name__ == '__main__':
    with Connection(conn):
        worker = Worker(map(Queue, listen))
        worker.work()
and run this worker.py
python worker.py
That should run the scheduled task (after_6_hours in this case) in the Worker Dyno.
Please note that the key here is to make the same source code (after_hours.py in this case) available to the worker too. The same is emphasized in the rq docs:
Make sure that the worker and the work generator share exactly the same source code.
If it helps, there is a hint in the docs to deal with different code bases:
For cases where the web process doesn't have access to the source code running in the worker (i.e. code base X invokes a delayed function from code base Y), you can pass the function as a string reference, too.
q = Queue('low', connection=redis_conn)
q.enqueue('my_package.my_module.my_func', 3, 4)
Hopefully rq-scheduler also respects this way of passing a string reference instead of a function object.
You can use any module/scheduling tool (Celery/RabbitMQ, APScheduler, etc.) as long as you understand this point.
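If rq-scheduler does accept the same string form (an assumption worth verifying against its docs), the scheduling call in views.py would become something like:
from datetime import timedelta
from redis import Redis
from rq_scheduler import Scheduler

scheduler = Scheduler(connection=Redis())
# pass the dotted path so the web process never has to import the function
scheduler.enqueue_in(timedelta(hours=6), 'after_hours.after_6_hours')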

Web2Py - configure a scheduler

I have an application written in Web2Py that contains some modules. I need to call some functions from a module on a periodic basis, say once daily. I have been trying to get a scheduler working for that purpose but am not sure how to set it up properly. I have referred to this and this to get started.
I have a scheduler.py file in the models directory, which contains code like this:
from gluon.scheduler import Scheduler
from Module1 import Module1

def daily_task():
    module1 = Module1()
    module1.action1(arg1, arg2, arg3)

daily_task_scheduler = Scheduler(db, tasks=dict(my_daily_task=daily_task))
In default.py I have the following code for the scheduler:
def daily_periodic_task():
    daily_task_scheduler.queue_task('daily_running_task', repeats=0, period=60)
[for testing I am running it every 60 seconds; otherwise, for daily runs, I plan to use period=86400]
In my Module1.py class, I have this kind of code:
def action1(self, arg1, arg2, arg3):
    for row in db().select(db.table1.ALL):
        row.processed = 'processed'
        row.update_record()
One of the issues I am facing is that I don't clearly understand how to make this scheduler automatically handle the execution of action1 on a daily basis.
When I launch my application using syntax similar to python web2py.py -K my_app, it shows this in the console:
web2py Web Framework
Created by Massimo Di Pierro, Copyright 2007-2015
Version 2.11.2-stable+timestamp.2015.05.30.16.33.24
Database drivers available: sqlite3, imaplib, pyodbc, pymysql, pg8000
starting single-scheduler for "my_app"...
However, when I see the browser at:
http://127.0.0.1:8000/my_app/default/daily_periodic_task
I just see "None" as text displayed on the screen and I don't see any changes produced by the scheduled task in my database table.
While when I see the browser at:
http://127.0.0.1:8000/my_app/default/index
I get an error stating "This web page is not available", basically indicating my application never got started.
When I start my application normally using python web2py.py my application loads fine but I don't see any changes produced by the scheduled task in my database table.
I am unable to figure out what I am doing wrong here and how to properly use the scheduler with Web2Py. Basically, I need to know how I can start my application normally, with the scheduled tasks properly running in the background.
Any help in this regard would be highly appreciated.
Running python web2py.py starts the built-in web server, enabling web2py to respond to HTTP requests (i.e., serving web pages to a browser). This has nothing to do with the scheduler and will not result in any scheduled tasks being run.
To run scheduled tasks, you must start one or more background workers via:
python web2py.py -K myapp
The above does not start the built-in web server and therefore does not enable you to visit web pages. It simply starts a worker process that will be available to execute scheduled tasks.
Also, note that the above does not actually result in any tasks being scheduled. To schedule a task, you must insert a record in the db.scheduler_task table, which you can do via any of the usual methods of inserting records (including using appadmin) or programmatically via the scheduler.queue_task method (which is what you use in your daily_periodic_task action).
Note, you can simultaneously start the built-in web server and a scheduler worker process via:
python web2py.py -a yourpassword -K myapp -X
So, to schedule a daily task and have it actually executed, you need to (a) start a scheduler worker and (b) schedule the task. You can schedule the task by visiting your daily_periodic_task action, but note that you only need to visit that action once, as once the task has been scheduled, it remains in effect indefinitely (given that you have set repeats=0).
If the task does not appear to be working, it is possible there is something wrong with the task itself that is resulting in an error.
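One concrete thing to check in the code above: the task name passed to queue_task must match a task the scheduler knows about, i.e. a key in the tasks dict passed to Scheduler(). The question registers my_daily_task but queues daily_running_task, so the worker may find nothing to execute. A corrected daily_periodic_task would look like:
def daily_periodic_task():
    # 'my_daily_task' matches the key in tasks=dict(my_daily_task=daily_task)
    daily_task_scheduler.queue_task('my_daily_task', repeats=0, period=86400)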

How to perform periodic task with Flask in Python

I've been using Flask to provide a simple web API for my k8055 USB interface board; fairly standard getters and putters, and Flask really made my life a lot easier.
But I want to be able to register changes of state as (or near) when they happen.
For instance, if I have a button connected to the board, I can poll the API for that particular port. But if I wanted the outputs to directly reflect the inputs, whether or not someone was talking to the API, I would have something like this:
while True:
    board.read()
    board.digital_outputs = board.digital_inputs
    board.read()
    time.sleep(1)
And every second, the outputs would be updated to match the inputs.
Is there any way to do this kind of thing under Flask? I've done similar things in Twisted before but Flask is too handy for this particular application to give up on it just yet...
Thanks.
For my Flask application, I contemplated using the cron approach described by Pashka in his answer, the schedule library, and APScheduler.
I found APScheduler to be simple and serving the periodic task run purpose, so went ahead with APScheduler.
Example code:
from flask import Flask
from apscheduler.schedulers.background import BackgroundScheduler

app = Flask(__name__)

def test_job():
    print('I am working...')

scheduler = BackgroundScheduler()
job = scheduler.add_job(test_job, 'interval', minutes=1)
scheduler.start()
You could use cron for simple tasks.
Create a Flask view for your task:
# a separate view for the periodic task
@app.route('/task')
def task():
    board.read()
    board.digital_outputs = board.digital_inputs
Then, using cron, download from that URL periodically:
# cron task to run each minute
0-59 * * * * run_task.sh
Where the contents of run_task.sh are:
wget http://localhost/task
Cron is unable to run more frequently than once a minute. If you need higher frequency (say, every 5 seconds = 12 times per minute), you must do it in run_task.sh in the following way:
# loop 12 times with a delay
for i in 1 2 3 4 5 6 7 8 9 10 11 12
do
    # download the url in the background so as not to affect the delay interval much
    wget -b http://localhost/task
    sleep 5s
done
For some reason, Antony's code wasn't working for me. I didn't get any error messages or anything, but the test_job function wouldn't run.
I was able to get it working by installing Flask-APScheduler and then using the following code, which is a blend of Antony's code and the example from this Techcoil article:
from flask import Flask
from flask_apscheduler import APScheduler

app = Flask(__name__)

def test_job():
    print('I am working...')

scheduler = APScheduler()
scheduler.init_app(app)
scheduler.start()
scheduler.add_job(id='test-job', func=test_job, trigger='interval', seconds=1)
No, there is no built-in task support in Flask, but you can use flask-celery or simply run your function in a separate thread (greenlet).
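As a minimal sketch of the separate-thread approach (the board object is the k8055 interface from the question; its setup is omitted here):
import threading
import time

from flask import Flask

app = Flask(__name__)
# 'board' is assumed to be the k8055 interface object set up elsewhere.

def poll_board():
    # Mirror the inputs onto the outputs once per second, regardless of API traffic.
    while True:
        board.read()
        board.digital_outputs = board.digital_inputs
        time.sleep(1)

# daemon=True so the polling thread exits together with the Flask process
threading.Thread(target=poll_board, daemon=True).start()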
