Is there any way to reload app.py automatically? - python

I made and published a Flask app on pythonanywhere. This is a single file app with data which is updated every 5 minutes. I need to update my dashboard every 5 minutes. After reloading it will update but it isn't possible to reload manually every 5 minutes. Is there any way to reload app.py using a script or changing some settings?

PythonAnywhere has an API that allows you to reload a website's code -- the help page I linked to explains all of the details of the API, and has instructions on how to set it up. Also, the example code that is provided on the API token setup page is exactly the Python code you would need to reload a website.
So you could write a script to do that, and then use the "Scheduled tasks" feature to run it every five minutes (you'd need to use twelve hourly tasks for that).
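As a rough sketch of what such a reload script could look like, using only the standard library: the endpoint path and the `Token` authorization header below are what I understand the PythonAnywhere API to use, but treat them as assumptions and check them against the API help page. The function names, the `API_HOST` constant and the account values in the usage note are hypothetical.

```python
import urllib.request

# Assumption: account hosted on www.pythonanywhere.com
# (EU accounts would use eu.pythonanywhere.com instead).
API_HOST = "www.pythonanywhere.com"


def build_reload_request(username, domain_name, token, host=API_HOST):
    """Build (but do not send) the POST request that asks for a web app reload."""
    url = (
        f"https://{host}/api/v0/user/{username}"
        f"/webapps/{domain_name}/reload/"
    )
    return urllib.request.Request(
        url,
        data=b"",  # empty body; the reload call takes no parameters here
        method="POST",
        headers={"Authorization": f"Token {token}"},
    )


def reload_webapp(username, domain_name, token):
    """Send the reload request and return the HTTP status code."""
    request = build_reload_request(username, domain_name, token)
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.status
```

A scheduled task would then call something like `reload_webapp("yourname", "yourname.pythonanywhere.com", token)`, with the token read from an environment variable or a config file rather than hard-coded.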
However, I'd recommend against writing your code so that the site needs to be reloaded in order to update. It would be better to make it serve up data directly from a database, and then to schedule tasks that run outside the context of the website to update that database every five minutes.
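To make the database suggestion a bit more concrete, here is a minimal sketch using SQLite from the standard library. The table name `readings`, its columns and the three function names are invented for the illustration; the idea is only that the scheduled task calls `store_reading` every five minutes, while the Flask view calls `latest_reading` on each request, so the site never needs a reload.

```python
import sqlite3
import time


def init_db(conn):
    """Create the (hypothetical) readings table if it does not exist yet."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS readings ("
        "ts REAL NOT NULL, value REAL NOT NULL)"
    )
    conn.commit()


def store_reading(conn, value, ts=None):
    """Called by the scheduled task: append one freshly fetched value."""
    timestamp = time.time() if ts is None else ts
    conn.execute(
        "INSERT INTO readings (ts, value) VALUES (?, ?)", (timestamp, value)
    )
    conn.commit()


def latest_reading(conn):
    """Called by the web view: return the newest row, or None if empty."""
    row = conn.execute(
        "SELECT ts, value FROM readings ORDER BY ts DESC LIMIT 1"
    ).fetchone()
    return None if row is None else {"ts": row[0], "value": row[1]}
```

In a real deployment both sides would open the same database file (for example `sqlite3.connect("dashboard.db")`), or use the MySQL database that comes with the account.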

Yes, you can do that.
Simply start the Flask app in a new process, then set up a function that runs after 5 minutes to restart the script:
# coding: utf-8
import sys
import subprocess
from threading import Timer
from multiprocessing import Process
from flask import Flask
app = Flask(__name__)
def run():
    # app.run() blocks, so we must run it in a new process
    app.run()
def run_after(p):
    print("## Restarting ....")
    # terminate the process
    p.terminate()
    # re-run the script, e.g. (python test.py blah blah)
    args = [sys.executable] + [sys.argv[0]]
    subprocess.call(args)
if __name__ == "__main__":
    # run the Flask app in a new process
    p = Process(target=run, args=())
    p.start()
    # set a threading timer which will execute run_after after 5 min
    _timer = Timer(5 * 60, run_after, (p,))
    _timer.start()

Related

SQLAlchemy queries running twice, only on a separate thread with long execution time

My application creates a Flask app as well as a background process that does work with my MySQL database (through SQLAlchemy) every so often:
from task_manager import TaskManager
# Session is a sessionmaker created earlier
task_manager = TaskManager(timedelta(seconds = 1), timedelta(seconds = 1), Session)
threading.Thread(target=task_manager.scheduler_loop).start()
app.run(debug=True, host='0.0.0.0', port=5000)
Whenever this process finds an available task (this is in the scheduler_loop that's running in the separate thread), it does some work:
with db_session(self.Session) as session:
    task = session.query(Task).filter(or_(Task.date == None, Task.date <= datetime.now())).order_by(Task.priority).first()
    if task is not None:
        if task.type == "create_paper_task":
            self.create_paper(session, task.paper_title)
        elif task.type == "update_citations_task":
            self.update_citations(session, task.paper)
        session.delete(task)
...
def create_paper(self, session, paper_title):
    ...
    # For the purposes of testing, I replaced a long API call with this sleep.
    time.sleep(3)
    paper = Paper(paper_title, year)
    paper.citations.append(Citation(citations, datetime.now()))
    session.add(paper)
If I try to use this code, the SQLAlchemy queries are run twice. Two Paper objects are created, and I get this error (presumably the Task being deleted twice):
/app/task_manager.py:17: SAWarning: DELETE statement on table 'create_paper_task' expected to delete 1 row(s); 0 were matched. Please set confirm_deleted_rows=False within the mapper configuration to prevent this warning.
The actual code itself isn't running twice, and there definitely aren't multiple scheduler threads running: I've tested this using print statements.
Now, the weirdest part about this is that the issue ONLY occurs when:
1) There's a long wait during the execution. If the time.sleep is removed, there's no problem, and
2) The Flask app is running and the scheduler loop is running in a separate thread. If the Flask app isn't running, or the scheduler_loop is running in the main thread (so obviously the Flask app isn't running), then there's no problem.
Also, the Flask app isn't being used at all while I'm testing this, so that's not the issue.
The app.run function of Flask will run your initialization code twice when you set debug=True. This is part of the way Flask can detect code changes and dynamically restart as needed. The downside is that this is causing your thread to run twice which in turn creates a race condition on reading and executing your tasks, which indeed would only show up when the task takes long enough for the second thread to start working.
See this question/answer for more details about what is happening: Why does running the Flask dev server run itself twice?
To avoid this you could add code to avoid the second execution, but that has the limitation that the auto-reloading feature for modified code will no longer work. In general, it would probably be better to use something like Celery to handle task execution instead of building your own solution. However, as mentioned in the linked answer, you could use something like
from werkzeug.serving import is_running_from_reloader
if is_running_from_reloader():
    from task_manager import TaskManager
    task_manager = TaskManager(timedelta(seconds = 1), timedelta(seconds = 1), Session)
    threading.Thread(target=task_manager.scheduler_loop).start()
which would keep your thread from being created unless you are in the second (reloaded) process. Note this would prevent your thread from executing at all if you remove debug=True.
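If you want the thread to start both with and without debug=True, one possible variation is to decide based on the WERKZEUG_RUN_MAIN environment variable, which the Werkzeug reloader sets to "true" in the reloaded child process. This is only a sketch: the helper name `should_start_background_thread` is made up, and it assumes that debug=True also means the reloader is active (with use_reloader=False there is no child process, and this check would never start the thread).

```python
import os


def should_start_background_thread(debug, environ=None):
    """Return True in exactly one process: the only process when the
    reloader is off, or the reloaded child process when it is on."""
    environ = os.environ if environ is None else environ
    if not debug:
        # No reloader, so there is a single process and it does the work.
        return True
    # With the reloader, only the child has WERKZEUG_RUN_MAIN == "true".
    return environ.get("WERKZEUG_RUN_MAIN") == "true"
```

You would then wrap the thread creation in `if should_start_background_thread(debug=True):` before calling app.run.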

Apscheduler calling function too quickly

Here is my scheduler.py file:
from apscheduler.schedulers.background import BackgroundScheduler
from django_apscheduler.jobstores import DjangoJobStore, register_events
from django.utils import timezone
from django_apscheduler.models import DjangoJobExecution
import sys
# This is the function you want to schedule - add as many as you want and then register them in the start() function below
def hello():
    print("Hello")
def start():
    scheduler = BackgroundScheduler()
    scheduler.add_jobstore(DjangoJobStore(), "default")
    # run this job every 10 seconds
    scheduler.add_job(hello, 'interval', seconds=10, jobstore='default')
    register_events(scheduler)
    scheduler.start()
    print("Scheduler started...", file=sys.stdout)
My Django app runs fine on localhost. I'm simply attempting to print 'hello' every 10 seconds in the terminal, but it sometimes prints like 3 or 4 at a time. Why is this? This was just a base template to help understand apscheduler.
The primary reason this might be happening is if you are running your development server without the --noreload flag set, which will cause the scheduler to be called twice (sometimes more).
When you run your server in development, try it like:
python manage.py runserver localhost:8000 --noreload
And see what happens. If it still keeps happening, it may be that the interval is too close together so by the time your system gets around to it, another version is still being called (even though it is a very short function). Django stores all pending, overdue, and run jobs in the database, so it has to store a record for the job after each transaction. Try expanding the interval and see what happens.
If neither of those things work, please post the rest of the code you are using and I will update my answer with other options. I've had a similar issue in the past, but it was solved with the --noreload option being set. When you run it in production behind a regular web server with DEBUG=False, then it should resolve itself as well.

Job is not performed by APScheduler's BackgroundScheduler

I have a Django application that I'm running on Docker. I'm trying to launch an APScheduler scheduler when I run the docker container.
I created a scheduler and I simply added it a job that I called test1, and that sends an email to my address.
This is the Python script that is launched when I run the container.
from apscheduler.schedulers.blocking import BlockingScheduler
from apscheduler.schedulers.background import BackgroundScheduler
#scheduler = BlockingScheduler()
scheduler = BackgroundScheduler()
def test1():
    ... (code to send email)
scheduler.add_job(test1, 'interval', seconds = 20)
scheduler.start()
These are the results I obtained with each of the two kinds of schedulers:
BlockingScheduler: the scheduler works, I receive an email every 20 seconds. However I can't access the app. I presume this is normal due to the very nature of the BlockingScheduler.
BackgroundScheduler: no problem to access the application. However, I receive no email.
Since the emails were sent in one of the two cases, I guess the problem is neither Django nor Docker related, but purely about APScheduler. I did my research but couldn't find why the BackgroundScheduler didn't work: in the tutorials I read, the developer set up the scheduler the same way I did.
Any help would be much appreciated, thanks!
UPDATE 1
I tried the two following things, both made the BackgroundScheduler behave like a BlockingScheduler (which is not what I want)
1) Setting the daemon option to False when initialising the scheduler instance:
scheduler = BackgroundScheduler(daemon = False)
2) "Trying to keep the main thread alive", as explained in these:
how-do-i-schedule-an-interval-job-with-apscheduler
apscheduler-inside-a-class-object
I added this right after scheduler.start():
while True:
    time.sleep(1)
scheduler.shutdown()
UPDATE 2
When I try to setup a BackgroundScheduler in a single Python file (outside of any application context), it works very well:
from apscheduler.schedulers.background import BackgroundScheduler
from apscheduler.schedulers.blocking import BlockingScheduler
def test1():
    print('issou')
scheduler = BackgroundScheduler()
scheduler.start()
scheduler.add_job(test1, 'interval', seconds=5)
print('yatangaki')
'yatangaki' is first printed, and then 'issou' every 5 seconds, so everything seems fine.
UPDATE 3
Now I've tried to start the scheduler on a Django app that I ran locally with python manage.py runserver, without using Docker.
It works perfectly: the emails are sent and I can access the main view of the application.
Note: the BackgroundScheduler is started by a function called start_test1. In this app, I run start_test1 in the top-level urls.py file. On the other app - the one that I run with Docker, which is the one I want to use in the end - start_test1 is started in a Python script, that is itself triggered in a .sh file, which I run via the CMD Docker command.
It appears it was all about where to start the scheduler and to add the job.
In what I did initially (putting the code in a .sh file), the BackgroundScheduler started, but the Python script ended immediately after being run, because it had no blocking behaviour and the .sh file wasn't really part of the app (it's used by the Dockerfile, not by the app).
I ended up finding the solution here:
execute-code-when-django-starts-once-only
There was no apps.py file inside my application so I created one and followed the instruction in this thread.
It works fine now.
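To illustrate why the first attempt sent nothing: the BackgroundScheduler runs its jobs in a background thread (a daemon thread by default, as far as I know, which is what the daemon option mentioned above controls), so when the launching script reaches its end the thread goes with it. Here is a minimal sketch of that effect using only the standard library, with a plain daemon thread standing in for the scheduler. The names `CHILD_TEMPLATE` and `run_child` are made up for the demonstration.

```python
import subprocess
import sys

# Source of a tiny child script: a daemon thread plays the role of the
# scheduler, and the main thread stays alive for `keep_alive` seconds.
CHILD_TEMPLATE = """
import threading, time

def job():
    time.sleep(0.2)
    print("job ran", flush=True)

threading.Thread(target=job, daemon=True).start()
time.sleep({keep_alive})
"""


def run_child(keep_alive):
    """Run the child script in a fresh interpreter and return its stdout."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD_TEMPLATE.format(keep_alive=keep_alive)],
        capture_output=True,
        text=True,
        timeout=30,
    )
    return result.stdout
```

With `run_child(0)` the script exits before the job fires and nothing is printed; with `run_child(1.0)` the main thread outlives the job and "job ran" appears. Starting the scheduler from inside the Django process avoids the problem because that process keeps running.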
I have a similar problem but it got solved only by not blocking the worker:
# working:
def worker():
    phone_elm = ....
    thread = threading.Thread(target=work, args=(phone_elm,))
    thread.start()
# not working:
def worker():
    phone_elm = ....
    work(phone_elm)
scheduler2 = BackgroundScheduler(timezone="Asia/Kolkata")
# schedule scanning running of folder's running file
scheduler2.add_job(worker, 'interval', seconds=15, max_instances=5000)
scheduler1.start()
By "not working" I mean that after it had been triggered about 10 times it stopped for no apparent reason, and started again only after one of those 10 runs had finished (work exited).
It is also part of Django, but it is clear that start is not exiting...

How to schedule a single-time event in Django in Heroku?

I have a question about the software design necessary to schedule an event that is going to be triggered once in the future in Heroku's distributed environment.
I believe it's better to write what I want to achieve, but I have certainly done my research and could not figure it out myself even after two hours of work.
Let's say in my views.py I have a function:
def after_6_hours():
    print('6 hours passed.')
def create_game():
    print('Game created')
    # of course time will be error, but that's just an example
    scheduler.do(after_6_hours, time=now + 6)
So what I want to achieve is to run the after_6_hours function exactly 6 hours after create_game has been invoked. Now, as you can see, this function is defined outside the usual clock.py, task.py, and similar files.
Now, how can I have my whole application running in Heroku, all the time, and be able to add this job into the queue of this imaginary-for-now-scheduler library?
On a side note, I can't use Temporizer add-on of Heroku. The combination of APScheduler and Python rq looked promising, but examples are trivial, all scheduled on the same file within clock.py, and I just simply don't know how to tie everything together with the setup I have. Thanks in advance!
In Heroku you can have your Django application running in a Web Dyno, which will be responsible to serve your application and also to schedule the tasks.
For example (Please note that I did not test run the code):
Create after_hours.py, which will have the function you are going to schedule (note that we are going to use the same source code in worker too).
def after_6_hours():
    print('6 hours passed.')
In your views.py, use rq (note that rq alone is not enough in your situation, as you have to schedule the task) together with rq-scheduler:
from redis import Redis
from rq_scheduler import Scheduler
from datetime import timedelta
from after_hours import after_6_hours
def create_game():
    print('Game created')
    scheduler = Scheduler(connection=Redis()) # Get a scheduler for the "default" queue
    scheduler.enqueue_in(timedelta(hours=6), after_6_hours) # schedules the job to run 6 hours later.
Calling create_game() should schedule after_6_hours() to run 6 hours later.
Hint: You can provision Redis in Heroku using Redis To Go add-on.
The next step is to run the rqscheduler tool, which polls Redis every minute to see whether any job is due and, if so, places it in the queue that the rq workers are listening to.
Now, in a Worker Dyno create a file after_hours.py
def after_6_hours():
    print('6 hours passed.')
    #Better return something
And create another file worker.py:
import os
import redis
from rq import Worker, Queue, Connection
from after_hours import after_6_hours
listen = ['high', 'default', 'low'] # while scheduling the task in views.py we sent it to default
redis_url = os.getenv('REDISTOGO_URL', 'redis://localhost:6379')
conn = redis.from_url(redis_url)
if __name__ == '__main__':
    with Connection(conn):
        worker = Worker(map(Queue, listen))
        worker.work()
and run this worker.py
python worker.py
That should run the scheduled task (after_6_hours in this case) in the Worker Dyno.
Please note that the key here is to make the same source code (after_hours.py in this case) available to worker too. The same is emphasized in rq docs
Make sure that the worker and the work generator share exactly the
same source code.
If it helps, there is a hint in the docs to deal with different code bases.
For cases where the web process doesn't have access to the source code
running in the worker (i.e. code base X invokes a delayed function
from code base Y), you can pass the function as a string reference,
too.
q = Queue('low', connection=redis_conn)
q.enqueue('my_package.my_module.my_func', 3, 4)
Hopefully rq-scheduler too respects this way of passing string instead of function object.
You can use any module/scheduling tool (Celery/RabbitMQ, APScheduler etc) as long as you understand this thing.
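To show what passing a function "as a string reference" amounts to, here is a small sketch of how a dotted path can be turned back into a callable on the worker side. This is not rq's actual implementation, just the general technique with importlib; the helper name `resolve_dotted_path` is invented. It also makes clear why the worker still needs the module itself to be importable.

```python
import importlib


def resolve_dotted_path(path):
    """Turn 'package.module.func' into the function object it names."""
    module_name, _, attribute = path.rpartition(".")
    if not module_name:
        raise ValueError(f"expected 'module.attribute', got {path!r}")
    # This import is the step that fails if the worker lacks the source code.
    module = importlib.import_module(module_name)
    return getattr(module, attribute)
```

For example, `resolve_dotted_path("after_hours.after_6_hours")` would only succeed in a process where after_hours.py is on the import path.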

How to perform periodic task with Flask in Python

I've been using Flask to provide a simple web API for my k8055 USB interface board; fairly standard getters and putters, and Flask really made my life a lot easier.
But I want to be able to register changes of state as, or near when, they happen.
For instance, if I have a button connected to the board, I can poll the API for that particular port. But if I wanted to have the outputs directly reflect the inputs, whether or not someone was talking to the API, I would have something like this:
while True:
    board.read()
    board.digital_outputs = board.digital_inputs
    board.read()
    time.sleep(1)
And every second, the outputs would be updated to match the inputs.
Is there any way to do this kind of thing under Flask? I've done similar things in Twisted before but Flask is too handy for this particular application to give up on it just yet...
Thanks.
For my Flask application, I contemplated using the cron approach described by Pashka in his answer, the schedule library, and APScheduler.
I found APScheduler to be simple and serving the periodic task run purpose, so went ahead with APScheduler.
Example code:
from flask import Flask
from apscheduler.schedulers.background import BackgroundScheduler
app = Flask(__name__)
def test_job():
    print('I am working...')
scheduler = BackgroundScheduler()
job = scheduler.add_job(test_job, 'interval', minutes=1)
scheduler.start()
You could use cron for simple tasks.
Create a flask view for your task.
# a separate view for periodic task
@app.route('/task')
def task():
    board.read()
    board.digital_outputs = board.digital_inputs
    # a Flask view must return a response
    return 'OK'
Then using cron, download from that url periodically
# cron task to run each minute
0-59 * * * * run_task.sh
Where run_task.sh contents are
wget http://localhost/task
Cron cannot run tasks more often than once a minute. If you need a higher frequency (say, every 5 seconds = 12 times per minute), you must do it in run_task.sh in the following way:
# loop 12 times with a delay
for i in 1 2 3 4 5 6 7 8 9 10 11 12
do
    # download url in background for not to affect delay interval much
    wget -b http://localhost/task
    sleep 5s
done
For some reason, Antony's code wasn't working for me. I didn't get any error messages or anything, but the test_job function wouldn't run.
I was able to get it working by installing Flask-APScheduler and then using the following code, which is a blend of Antony's code and the example from this Techcoil article.
from flask import Flask
from flask_apscheduler import APScheduler
app = Flask(__name__)
def test_job():
    print('I am working...')
scheduler = APScheduler()
scheduler.init_app(app)
scheduler.start()
scheduler.add_job(id='test-job', func=test_job, trigger='interval', seconds=1)
No, Flask has no built-in support for tasks, but you can use flask-celery or simply run your function in a separate thread (or greenlet).
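For the separate-thread route, a minimal sketch with only the standard library could look like this. The `PeriodicTask` class is a made-up helper, not part of Flask; it calls the given function every `interval_seconds` until stopped, and uses a daemon thread so it does not keep the process alive on its own.

```python
import threading


class PeriodicTask:
    """Call `func` every `interval_seconds` in a background thread."""

    def __init__(self, interval_seconds, func):
        self._interval = interval_seconds
        self._func = func
        self._stop_event = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()

    def stop(self):
        # Wake the loop immediately and wait for the thread to finish.
        self._stop_event.set()
        self._thread.join()

    def _run(self):
        # Event.wait() returns False on timeout (time to run the job)
        # and True once stop() has been called (time to exit).
        while not self._stop_event.wait(self._interval):
            self._func()
```

In the k8055 case you would wrap the read-and-copy step in a function, create `PeriodicTask(1, that_function)` and call `start()` before `app.run()`. Keep in mind that the debug reloader runs the startup code twice, so the task may end up running in two processes while developing.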
