RQ scheduler sending multiple emails - python

I am using the Django RQ scheduler.
scheduled_tasks.py
from redis import Redis
from rq_scheduler import Scheduler
from datetime import datetime
scheduler = Scheduler(connection=Redis())  # Get a scheduler for the "default" queue
# scheduler = django_rq.get_scheduler("default")
now = datetime.now()
start = now.replace(hour=8, minute=0, second=0, microsecond=0)
scheduler.schedule(
    scheduled_time=start,  # Time for first execution, in UTC timezone
    func=broadcast_approved_jobs,  # Function to be queued
    interval=86400,  # Time before the function is called again, in seconds
    repeat=None  # Repeat this number of times (None means repeat forever)
)
I need this job to run only once a day, but it is sending emails repeatedly. I think the scheduler is calling broadcast_approved_jobs multiple times. Any idea why?

(take 2)
This is the function I'm using. I didn't write it, and I forget where I found it. Even if your scheduling code is being called multiple times, it will at least remove any existing jobs.
from collections import defaultdict
def schedule_once(scheduled_time, func, args=None, kwargs=None,
                  interval=None, repeat=None, result_ttl=None, timeout=None, queue_name=None):
    """
    Schedule job once or reschedule when interval changes
    """
    # Assumption: `jobs` and `functions` were not defined in the snippet as posted;
    # here they are rebuilt from the jobs currently known to the scheduler.
    jobs = list(scheduler.get_jobs())
    functions = defaultdict(list)
    for job in jobs:
        functions[job.func].append(job.meta.get('interval'))
    if func not in functions or interval not in functions[func] \
            or len(functions[func]) > 1:
        # clear all scheduled jobs for this function
        # (explicit loop, because map() is lazy in Python 3 and would cancel nothing)
        for job in jobs:
            if job.func == func:
                scheduler.cancel(job)
        # schedule with new interval
        scheduler.schedule(scheduled_time, func, interval=interval, repeat=repeat)
schedule_once(
    scheduled_time=datetime.utcnow(),  # Time for first execution, in UTC timezone
    func=readmail,  # Function to be queued
    interval=120,  # Time before the function is called again, in seconds
    repeat=0  # Repeat this number of times (None means repeat forever)
)
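The idea behind schedule_once can be illustrated without Redis. The sketch below is only an illustration: FakeScheduler is a hypothetical in-memory stand-in and not the rq-scheduler API, and send_mail is a placeholder job. It assumes the duplicates come from the scheduling code running more than once (for example on every import or process start), which is what the answer above guards against.

```python
# Hypothetical in-memory stand-in for a scheduler; NOT the rq-scheduler API.
class FakeScheduler:
    def __init__(self):
        self.jobs = []  # list of (func, interval) tuples

    def schedule(self, func, interval):
        self.jobs.append((func, interval))

    def cancel_all(self, func):
        # Drop every job registered for this function.
        self.jobs = [job for job in self.jobs if job[0] is not func]


def schedule_once(scheduler, func, interval):
    """Ensure exactly one job for `func` exists with the given interval."""
    existing = [job for job in scheduler.jobs if job[0] is func]
    if len(existing) == 1 and existing[0][1] == interval:
        return  # already scheduled as requested; nothing to do
    scheduler.cancel_all(func)
    scheduler.schedule(func, interval)


def send_mail():  # placeholder job
    pass


fake = FakeScheduler()
for _ in range(3):  # simulates the scheduling module being executed three times
    schedule_once(fake, send_mail, 86400)
job_count = len(fake.jobs)  # stays at one job despite three calls
```

Calling schedule_once again with a different interval replaces the existing job instead of adding a second one.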

Related

Python script scheduling [duplicate]

Before I ask: cron jobs and Task Scheduler will be my last options. This script will be used across Windows and Linux, and I'd prefer a coded-out method of doing this over leaving it to the end user to set up.
Is there a library for Python that I can use to schedule tasks? I will need to run a function once every hour, however, over time if I run a script once every hour and use .sleep, "once every hour" will run at a different part of the hour from the previous day due to the delay inherent to executing/running the script and/or function.
What is the best way to schedule a function to run at a specific time of day (more than once) without using a Cron Job or scheduling it with Task Scheduler?
Or if this is not possible, I would like your input as well.
AP Scheduler fit my needs exactly.
Version < 3.0
import datetime
import time
from apscheduler.scheduler import Scheduler
# Start the scheduler
sched = Scheduler()
sched.daemonic = False
sched.start()
def job_function():
    print("Hello World")
    print(datetime.datetime.now())
    time.sleep(20)
# Schedules job_function to be run once each minute
sched.add_cron_job(job_function, minute='0-59')
out:
>Hello World
>2014-03-28 09:44:00.016.492
>Hello World
>2014-03-28 09:45:00.0.14110
Version > 3.0
(From Animesh Pandey's answer below)
from apscheduler.schedulers.blocking import BlockingScheduler
sched = BlockingScheduler()
@sched.scheduled_job('interval', seconds=10)
def timed_job():
    print('This job is run every 10 seconds.')
@sched.scheduled_job('cron', day_of_week='mon-fri', hour=10)
def scheduled_job():
    print('This job is run every weekday at 10am.')
sched.configure(options_from_ini_file)
sched.start()
Maybe this can help: Advanced Python Scheduler
Here's a small piece of code from their documentation:
from apscheduler.schedulers.blocking import BlockingScheduler
def some_job():
    print("Decorated job")
scheduler = BlockingScheduler()
scheduler.add_job(some_job, 'interval', hours=1)
scheduler.start()
To run something at 10 minutes past every hour:
import time
from datetime import datetime, timedelta
while True:
    print('Run something..')
    dt = datetime.now() + timedelta(hours=1)
    dt = dt.replace(minute=10)
    while datetime.now() < dt:
        time.sleep(1)
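The target time in a loop like this can also be computed directly, which makes the wrap-around at the end of the hour and at midnight explicit. This is a minimal sketch; next_run_at_minute is a hypothetical helper name and is not part of any library.

```python
from datetime import datetime, timedelta


def next_run_at_minute(now, minute=10):
    """Return the first datetime strictly after `now` whose minute equals `minute`."""
    candidate = now.replace(minute=minute, second=0, microsecond=0)
    if candidate <= now:
        # This hour's slot has already passed, so move to the next hour.
        candidate += timedelta(hours=1)
    return candidate


def seconds_until(now, target):
    """Number of seconds to sleep to get from `now` to `target`."""
    return (target - now).total_seconds()
```

A loop could then call time.sleep(seconds_until(datetime.now(), next_run_at_minute(datetime.now()))) once per iteration instead of polling every second.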
For apscheduler < 3.0, see Unknown's answer.
For apscheduler > 3.0
from apscheduler.schedulers.blocking import BlockingScheduler
sched = BlockingScheduler()
@sched.scheduled_job('interval', seconds=10)
def timed_job():
    print('This job is run every 10 seconds.')
@sched.scheduled_job('cron', day_of_week='mon-fri', hour=10)
def scheduled_job():
    print('This job is run every weekday at 10am.')
sched.configure(options_from_ini_file)
sched.start()
Update: see the apscheduler documentation.
This is for apscheduler-3.3.1 on Python 3.6.2.
"""
Following configurations are set for the scheduler:
- a MongoDBJobStore named “mongo”
- an SQLAlchemyJobStore named “default” (using SQLite)
- a ThreadPoolExecutor named “default”, with a worker count of 20
- a ProcessPoolExecutor named “processpool”, with a worker count of 5
- UTC as the scheduler’s timezone
- coalescing turned off for new jobs by default
- a default maximum instance limit of 3 for new jobs
"""
from pytz import utc
from apscheduler.schedulers.blocking import BlockingScheduler
from apscheduler.jobstores.sqlalchemy import SQLAlchemyJobStore
from apscheduler.executors.pool import ProcessPoolExecutor
"""
Method 1:
"""
jobstores = {
    'mongo': {'type': 'mongodb'},
    'default': SQLAlchemyJobStore(url='sqlite:///jobs.sqlite')
}
executors = {
    'default': {'type': 'threadpool', 'max_workers': 20},
    'processpool': ProcessPoolExecutor(max_workers=5)
}
job_defaults = {
    'coalesce': False,
    'max_instances': 3
}
"""
Method 2 (ini format):
"""
gconfig = {
    'apscheduler.jobstores.mongo': {
        'type': 'mongodb'
    },
    'apscheduler.jobstores.default': {
        'type': 'sqlalchemy',
        'url': 'sqlite:///jobs.sqlite'
    },
    'apscheduler.executors.default': {
        'class': 'apscheduler.executors.pool:ThreadPoolExecutor',
        'max_workers': '20'
    },
    'apscheduler.executors.processpool': {
        'type': 'processpool',
        'max_workers': '5'
    },
    'apscheduler.job_defaults.coalesce': 'false',
    'apscheduler.job_defaults.max_instances': '3',
    'apscheduler.timezone': 'UTC',
}
sched_method1 = BlockingScheduler()  # uses overrides from Method1
sched_method2 = BlockingScheduler()  # uses same overrides from Method2 but in an ini format
@sched_method1.scheduled_job('interval', seconds=10)
def timed_job():
    print('This job is run every 10 seconds.')
@sched_method2.scheduled_job('cron', day_of_week='mon-fri', hour=10)
def scheduled_job():
    print('This job is run every weekday at 10am.')
# Note: BlockingScheduler.start() blocks, so start only one of the two schedulers per process.
sched_method1.configure(jobstores=jobstores, executors=executors, job_defaults=job_defaults, timezone=utc)
sched_method1.start()
sched_method2.configure(gconfig=gconfig)
sched_method2.start()
The simplest option I can suggest is using the schedule library.
In your question, you said "I will need to run a function once every hour".
The code to do this is very simple:
import time
import schedule
def thing_you_wanna_do():
    ...
    ...
    return
schedule.every().hour.do(thing_you_wanna_do)
while True:
    schedule.run_pending()
    time.sleep(1)  # avoid a busy loop that pins the CPU
You also asked how to do something at a certain time of the day.
Some examples of how to do this are:
import time
import schedule
def thing_you_wanna_do():
    ...
    ...
    return
schedule.every().day.at("10:30").do(thing_you_wanna_do)
schedule.every().monday.do(thing_you_wanna_do)
schedule.every().wednesday.at("13:15").do(thing_you_wanna_do)
# If you would like some randomness / variation you could also do something like this
schedule.every(1).to(2).hours.do(thing_you_wanna_do)
while True:
    schedule.run_pending()
    time.sleep(1)  # avoid a busy loop that pins the CPU
90% of the code used is the example code of the schedule library. Happy scheduling!
Run the script on every 15-minute mark of the hour.
For example, you want to receive 15-minute stock price quotes, which are updated every 15 minutes.
import time
from datetime import datetime
while True:
    print("Update data:", datetime.now())
    sleep = 15 - datetime.now().minute % 15
    if sleep == 15:
        run_strategy()  # your own function
        time.sleep(sleep * 60)
    else:
        time.sleep(sleep * 60)
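The loop above works in whole minutes, so the seconds already elapsed within the current minute are not accounted for. A possible refinement, sketched here with the hypothetical helper seconds_to_next_boundary, computes the wait in seconds. It assumes the step divides 60 evenly (for example 5, 10, 15, 20, or 30 minutes).

```python
from datetime import datetime


def seconds_to_next_boundary(now, step_minutes=15):
    """Seconds from `now` until the next multiple of `step_minutes` past the hour.

    Assumes 60 is divisible by `step_minutes`.
    """
    step = step_minutes * 60
    # Seconds elapsed since the start of the current hour.
    elapsed = now.minute * 60 + now.second + now.microsecond / 1_000_000
    # Exactly on a boundary this yields a full step, i.e. the *next* boundary.
    return step - (elapsed % step)
```

A loop would then sleep for seconds_to_next_boundary(datetime.now()) and run the work right after waking up.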
# For scheduling task execution
import schedule
import time
def job():
    print("I'm working...")
schedule.every(1).minutes.do(job)
#schedule.every().hour.do(job)
#schedule.every().day.at("10:30").do(job)
#schedule.every(5).to(10).minutes.do(job)
#schedule.every().monday.do(job)
#schedule.every().wednesday.at("13:15").do(job)
#schedule.every().minute.at(":17").do(job)
while True:
    schedule.run_pending()
    time.sleep(1)
The Python standard library does provide sched and threading for this task. But this means your scheduler script will have to be running all the time instead of leaving its execution to the OS, which may or may not be what you want.
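As a rough illustration of the sched module mentioned above, the sketch below queues a fixed number of runs at absolute times with enterabs, so a slow run does not push later runs back. The helper name schedule_fixed_times, the 0.2-second interval, and the choice of three runs are assumptions made for this example only.

```python
import sched
import time


def schedule_fixed_times(scheduler, start, interval, action, runs):
    """Queue `action` at start + interval, start + 2*interval, ... (absolute times)."""
    for i in range(1, runs + 1):
        # Absolute times mean a slow action does not shift the later runs.
        scheduler.enterabs(start + i * interval, 1, action)


if __name__ == "__main__":
    s = sched.scheduler(time.monotonic, time.sleep)
    schedule_fixed_times(s, time.monotonic(), 0.2, lambda: print("tick"), runs=3)
    s.run()  # blocks until the queue is empty
```

For an open-ended hourly job, the action would typically re-enter itself for the next slot before returning.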
On the version posted by sunshinekitty called "Version < 3.0", you may need to specify apscheduler 2.1.2. I accidentally had version 3 on my 2.7 install, so I went:
pip uninstall apscheduler
pip install apscheduler==2.1.2
It worked correctly after that. Hope that helps.
clock.py
from apscheduler.schedulers.blocking import BlockingScheduler
import pytz
sched = BlockingScheduler(timezone=pytz.timezone('Africa/Lagos'))
@sched.scheduled_job('cron', day_of_week='mon-sun', hour=22)
def scheduled_job():
    print('This job is run every day at 10pm.')
    # your job here
sched.start()
Procfile
clock: python clock.py
requirements.txt
APScheduler==3.0.0
After deployment, the final step is to scale up the clock process. This is a singleton process, meaning you’ll never need to scale up more than 1 of these processes. If you run two, the work will be duplicated.
$ heroku ps:scale clock=1
Source: https://devcenter.heroku.com/articles/clock-processes-python
Perhaps Rocketry suits your needs. It's a powerful scheduler that is very easy to use, has a lot of built-in scheduling options and it is easy to extend:
from rocketry import Rocketry
from rocketry.conds import daily, every, after_success
app = Rocketry()
@app.task(every("1 hour 30 minutes"))
def do_things():
    ...
@app.task(daily.between("12:00", "17:00"))
def do_daily_afternoon():
    ...
@app.task(daily & after_success(do_things))
def do_daily_after_task():
    ...
if __name__ == "__main__":
    app.run()
It has much more though:
String based scheduling syntax
Logical statements (AND, OR, NOT)
A lot of built-in scheduling options
Easy to customize (custom conditions, parameters etc.)
Parallelization (run on separate thread or process)
Parametrization (execution order and input-output)
Persistence: put the logs anywhere you like
Modify scheduler at runtime (i.e. build an API on top of it)
Links:
Documentation: https://rocketry.readthedocs.io/
Source code: https://github.com/Miksus/rocketry
Disclaimer: I'm the author
You have probably found a solution already, @lukik, but if you want to remove a scheduled job, you can use:
job = scheduler.add_job(myfunc, 'interval', minutes=2)
job.remove()
or, if you need to use an explicit job ID:
scheduler.add_job(myfunc, 'interval', minutes=2, id='my_job_id')
scheduler.remove_job('my_job_id')
For more information, see: https://apscheduler.readthedocs.io/en/stable/userguide.html#removing-jobs
I found that a scheduler needs to wake the program up every second, which could be costly on a hosted server.
So I use the following instead.
It runs every minute at the 5th second, and you can change it to hours or days by recalculating the waiting period in seconds.
import time
import datetime
Initiating = True
print(datetime.datetime.now())
while True:
    if Initiating:
        print("Initiate")
        print(datetime.datetime.now())
        time.sleep(60 - time.time() % 60 + 5)
        Initiating = False
    else:
        time.sleep(60)
        print("working")
        print(datetime.datetime.now())
This method worked for me using relativedelta and datetime and a modulo boolean check for every hour.
It runs every hour from the time you start it.
import time
from datetime import datetime, timedelta
from dateutil.relativedelta import relativedelta
# Track next run outside loop and update the next run time within the loop
nxt_run = datetime.now()
# Because the loop iterates far more often than once an hour, use a boolean check against nxt_run to decide when to run next
while True:
    cnow = datetime.now()  # track the current time
    time.sleep(1)  # good to have so cpu doesn't spike
    if (cnow.hour % 1 == 0 and cnow >= nxt_run):
        print(f"start @{cnow}: next run @{nxt_run}")
        nxt_run = cnow + relativedelta(hours=1)  # add an hour to the next run
    else:
        print(f"next run @{nxt_run}")
One option is to write a C/C++ wrapper that executes the python script on a regular basis. Your end-user would run the C/C++ executable, which would remain running in the background, and periodically execute the python script. This may not be the best solution, and may not work if you don't know C/C++ or want to keep this 100% python. But it does seem like the most user-friendly approach, since people are used to clicking on executables. All of this assumes that python is installed on your end user's computer.
Another option is to use cron job/Task Scheduler but to put it in the installer as a script so your end user doesn't have to do it.

How to check different run times of a task in a DAG in an External Sensor Airflow

Imagine I have a DAG A with some tasks, and these tasks depend on an external sensor that watches a task in DAG B.
For example, I want to check the state of a task in DAG B at 10:00, and if that run succeeded, the tasks in DAG A can run.
But suppose that, for some reason, the 10:00 run of the task in DAG B failed, while the 11:00 run of the same task succeeded.
The problem is that the tasks in DAG A will stay pending forever because the task in DAG B failed at 10:00, even though it would be fine if the next run succeeded.
How can I make an external sensor in Airflow check the state of the next run in another DAG and, if it succeeded, let my tasks run without a problem?
P.S.: for various reasons I can't use retries!
Thank you in advance.
I found a solution myself. It may not be the best approach, but it works.
In this case we can define a function for on_failure_callback and set a timeout for our ExternalTaskSensor. When the timeout is reached, we check the next run of the task in the other DAG, and if it succeeded, we set the state of our ExternalTaskSensor to SUCCESS so that the tasks that depend on this sensor can run without a problem.
This is the code for this approach:
from airflow.utils.state import State
from airflow.sensors.external_task_sensor import ExternalTaskSensor
from airflow.exceptions import AirflowSensorTimeout
from datetime import datetime, timedelta, timezone
from airflow.api.common.experimental.get_task_instance import get_task_instance
from dateutil.parser import parse
from functools import partial
def _failure_callback(task_id, dag_id, execution_date, context):
    if isinstance(context['exception'], AirflowSensorTimeout):
        sensor_instance = context['task_instance']
        next_execution_date = parse(context['ts']) + -(execution_date) + timedelta(hours=1)
        ti = get_task_instance(dag_id=dag_id, task_id=task_id, execution_date=next_execution_date)
        if ti.current_state() == 'success':
            sensor_instance.set_state(State.SUCCESS)
sensor = ExternalTaskSensor(external_task_id='external_task_id',
                            task_id='sensor',
                            external_dag_id='external_dag_id',
                            execution_delta=timedelta(hours=-24) + timedelta(minutes=-30),
                            timeout=5,
                            on_failure_callback=partial(_failure_callback, 'external_task_id', 'external_dag_id', timedelta(hours=-24) + timedelta(minutes=-30)),
                            dag=dag)
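The date arithmetic in the callback can be checked in isolation with plain datetime objects. In the sketch below, the function names and the one-hour schedule step are assumptions for illustration; the only Airflow behaviour relied on is that the sensor looks at its own execution date minus execution_delta.

```python
from datetime import datetime, timedelta, timezone


def sensed_execution_date(sensor_ts, execution_delta):
    """Execution date the sensor checks: its own date minus `execution_delta`."""
    return sensor_ts - execution_delta


def next_sensed_execution_date(sensor_ts, execution_delta,
                               schedule_step=timedelta(hours=1)):
    """Execution date of the following run of the external task (assumed hourly)."""
    return sensed_execution_date(sensor_ts, execution_delta) + schedule_step


# Same delta as in the sensor definition: a negative delta looks forward in time.
delta = timedelta(hours=-24) + timedelta(minutes=-30)
ts = datetime(2024, 1, 1, 10, 0, tzinfo=timezone.utc)
first_checked = sensed_execution_date(ts, delta)
fallback_checked = next_sensed_execution_date(ts, delta)
```

If the external DAG runs on a different interval, schedule_step would need to match it.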

python execute function every n seconds wait for completion

I want to execute FUNCTION periodically every 60 seconds, but I don't want to execute FUNCTION again IF the previous run has not completed yet. If the previous run completes in e.g. 120s then I want to execute a new FUNCTION call straight away. If previous run completed in e.g. 10s then I want to wait 50s before I execute a new FUNCTION call.
Please see my implementation below.
Can I achieve it with e.g. subprocess.run or some timeloop library so that the implementation would be much cleaner?
import time
def hello(x):
    # some logic here
    # execution could take any time between e.g. <10s, 120s>
    pass
def main(ii):
    while True:
        start = int(time.time())
        try:
            val = next(ii)
        except StopIteration as ex:
            return None
        else:
            hello(val)
            run_time_diff = int(time.time()) - start
            if run_time_diff < 60:
                time.sleep(60 - run_time_diff)
ii = iter(list[[...],[...],...[...]])
main(ii=ii)
Maybe APScheduler could help you. But if your job runs longer than the interval, a run could be skipped. In this case you can increase the number of workers.
import datetime
import time
from apscheduler.schedulers.background import BackgroundScheduler
scheduler = BackgroundScheduler()
def some_job():
    time.sleep(5)
    print(f"{datetime.datetime.utcnow()}: Every 10 seconds")
job = scheduler.add_job(some_job, 'interval', seconds=10, max_instances=1)
scheduler.start()
try:
    while True:
        time.sleep(1)
finally:
    scheduler.shutdown()
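Without any extra library, the behaviour asked for in the question (wait out the rest of the period after a short run, start again immediately after a long one) can be sketched as a small helper. The name run_fixed_rate and the injectable clock and sleep parameters are assumptions made here so the logic can be exercised without real waiting.

```python
import time


def run_fixed_rate(items, func, period=60.0, clock=time.monotonic, sleep=time.sleep):
    """Call func(item) for each item, starting calls at least `period` seconds apart.

    Calls never overlap: a run longer than `period` is followed immediately
    by the next one, a shorter run is followed by a sleep for the remainder.
    """
    for item in items:
        started = clock()
        func(item)
        remaining = period - (clock() - started)
        if remaining > 0:
            sleep(remaining)
```

time.monotonic is used instead of time.time so that system clock adjustments do not affect the measured run time.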

How to access return value from apscheduler?

I can't quite piece together how to access the return values from scheduled jobs in apscheduler. The job needs to run at a different time each day, and I need the return value from today's job to schedule tomorrow's job.
This link (how to get return value from apscheduler jobs) appears to be the best previous answer to this question. It suggests adding a listener to the scheduler. I've added a listener, but I'm not sure how to access its return value. I can access the listeners attached to the scheduler, but I can't access their outputs. A listener, job_runs() in the code below, will print when a scheduled job runs.
Further, I know I need to access a JobExecutionEvent (https://apscheduler.readthedocs.io/en/latest/modules/events.html#module-apscheduler.events) which holds the return value from the function.
First, the function I want to access is run_all() where a bunch of operations are performed, but I just return True for the test case.
from apscheduler.schedulers.background import BackgroundScheduler
from apscheduler.schedulers.blocking import BlockingScheduler
from apscheduler.events import EVENT_JOB_EXECUTED, EVENT_JOB_ERROR, JobExecutionEvent
from datetime import datetime, timedelta
import logging
def run_all():
    return True
def job_runs(event):  # listener function
    if event.exception:
        print('The job did not run')
    else:
        print('The job completed @ {}'.format(datetime.now()))
def job_return_val(event):  # listener function
    return event.retval
Then, I setup the scheduler, add the listeners, and add the job. The trigger is set to run the function 1 minute after the job is added to scheduler.
scheduler = BackgroundScheduler()
scheduler.add_listener(job_runs, EVENT_JOB_EXECUTED | EVENT_JOB_ERROR)
scheduler.add_listener(job_return_val, EVENT_JOB_EXECUTED)
cron_args = datetime_to_dict(datetime.now() + timedelta(minutes=1))
job = scheduler.add_job(run_all, "cron", **cron_args)
Next, I start the scheduler and print the scheduled job. Additionally, I setup logging so I know where the scheduler is.
test = scheduler.start()
scheduler.print_jobs()
logging.basicConfig()
logging.getLogger('apscheduler').setLevel(logging.DEBUG)
With the logging enabled, the scheduler reports that the job is run and removed from the scheduler, as I expect it to. job_runs() prints the correct output to the console. And with breakpoints, I know job_return_val() is called. However, I have no clue where the value it returns is sent to. The function appears to be called in a different thread called APScheduler. I don't know much about threads, but that makes sense. However, I do not understand when the output from that thread is returned to the main thread.
Finally, I've tried instantiating a JobExecutionEvent with the code, job_id, jobstore, and scheduled_run_time accessible from the attributes of scheduler and job, but the JobExecutionEvent does not seem to have any knowledge that the event was run in scheduler. That also seems to make sense due to the threading described in the preceding paragraph.
Any help in sorting through this would be great!
The return value of listener is not used anywhere (see the code), so there's no use to return any value anyway. If you need to schedule another job based on the value of previous job (acquired in the listener via the event object), you have to do it right in that listener.
EDIT: To illustrate how to do it (and prove it's possible), see this sample code:
from datetime import datetime
import time
from apscheduler.events import EVENT_JOB_ERROR, EVENT_JOB_EXECUTED
from apscheduler.schedulers.background import BackgroundScheduler
def tick():
    print('Tick! The time is: %s' % datetime.now())
def tack():
    print('Tack! The time is: %s' % datetime.now())
def listener(event):
    if not event.exception:
        job = scheduler.get_job(event.job_id)
        # get_job() returns None once a one-off job (such as tack) has been removed
        if job is not None and job.name == 'tick':
            scheduler.add_job(tack)
if __name__ == '__main__':
    scheduler = BackgroundScheduler()
    scheduler.add_listener(listener, EVENT_JOB_EXECUTED | EVENT_JOB_ERROR)
    scheduler.add_job(tick, 'interval', seconds=5)
    scheduler.start()
    try:
        while True:
            time.sleep(1)
    except (KeyboardInterrupt, SystemExit):
        scheduler.shutdown()
The output:
(venv) pasmen@nyx:~/tmp/x$ python test.py
Tick! The time is: 2019-04-03 19:51:29.192420
Tack! The time is: 2019-04-03 19:51:29.195878
Tick! The time is: 2019-04-03 19:51:34.193145
Tack! The time is: 2019-04-03 19:51:34.194898
Tick! The time is: 2019-04-03 19:51:39.193207
Tack! The time is: 2019-04-03 19:51:39.194868
Tick! The time is: 2019-04-03 19:51:44.193223
Tack! The time is: 2019-04-03 19:51:44.195066
...
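If the value is needed back in the main thread rather than inside the listener, one option is to have the listener push it onto a thread-safe queue. The sketch below is hedged: collect_retval and results are names invented for this example, and a SimpleNamespace stands in for the event object that APScheduler would pass, so the snippet runs without a scheduler. It relies only on the job_id, retval, and exception attributes already used in this thread.

```python
import queue
from types import SimpleNamespace

results = queue.Queue()  # thread-safe hand-off from the scheduler thread


def collect_retval(event):
    """Listener: push the job's id and return value onto the queue."""
    if getattr(event, "exception", None) is None:
        results.put((event.job_id, event.retval))


# Stand-in for the event APScheduler would deliver after run_all() returns True.
fake_event = SimpleNamespace(job_id="run_all", retval=True, exception=None)
collect_retval(fake_event)

# The main thread blocks here until a result arrives (or the timeout expires).
job_id, retval = results.get(timeout=1)
```

With a real scheduler, the listener would be registered the same way as in the question, for example scheduler.add_listener(collect_retval, EVENT_JOB_EXECUTED).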
You can use global variables for now. Here's an example:
from apscheduler.schedulers.blocking import BlockingScheduler
from apscheduler.triggers.cron import CronTrigger
def fn():
    '''Increase `times` by one and print it.'''
    global times
    times += 1
    print(times)
sched = BlockingScheduler()
times = 0
# Execute fn() each second.
sched.add_job(fn, trigger=CronTrigger(second='*/1'))
sched.start()
What you need would require the stateful jobs feature to be implemented.
I encountered the same problem a few days ago. My first solution was using keyword global, but not long before I realized it was problematic, because the variables defined outside the job could unexpectedly be changed, especially when they are local variables in a loop.
Then I thought about using listeners, too. But the callback passed into a listener takes only the event as a single argument, which means the only information you can get from the callback is the event itself, profoundly limiting what you could possibly do.
Finally I chose to pass a func to the task to be scheduled, which works fine for me. All you really have to do is use the return value of the scheduled task as an argument for the func, as the code below shows.
from apscheduler.schedulers.blocking import BlockingScheduler
from datetime import datetime
scheduler = BlockingScheduler()
def job_with_return(a: int, b: int, callback_return):
    # You can do something heavier here with the inputs.
    result = a + b
    print(f'[in job] result is {result}')
    if callback_return:
        callback_return(result)
scheduler.add_job(func=job_with_return,
                  trigger='date',
                  args=(1, 2, lambda r: print(f'[out of job]: result is {r}')),
                  run_date=datetime.now(),
                  )
scheduler.start()
The output:
[in job] result is 3
[out of job]: result is 3
As to the request of OP,
The job needs to run at a different time each day, and I need the
return value from today's job to schedule tomorrow's job.
you can additionally pass the func the scheduler as well as the time you want tomorrow's task to run, so that you can arrange the schedule inside the func.
from apscheduler.schedulers.blocking import BlockingScheduler
from datetime import datetime, timedelta
scheduler = BlockingScheduler()
today = datetime.now()
tomorrow = today + timedelta(seconds=1)
def task_today(a: int, b: int, scheduler: BlockingScheduler, callback_return, tomorrow: datetime):
    # What we have to do today is to get the result and use it to schedule tomorrow's task.
    result_today = a + b
    print(f"[{datetime.now().strftime('%H:%M:%S')}] (Today) The result is {result_today}.")
    scheduler.add_job(callback_return, 'date',
                      args=(result_today,),
                      run_date=tomorrow,
                      id='job_tomorrow')
def task_tomorrow(result_from_today: int):
    result_tomorrow = result_from_today * 2
    print(f"[{datetime.now().strftime('%H:%M:%S')}] (Tomorrow) The result is {result_tomorrow}.")
scheduler.add_job(func=task_today,
                  trigger='date',
                  args=(1, 2, scheduler, task_tomorrow, tomorrow),
                  run_date=today,
                  id='job_today')
scheduler.start()
The output:
[22:22:40] (Today) The result is 3.
[22:22:41] (Tomorrow) The result is 6.
You can even make a daily task with a recursion.
from apscheduler.schedulers.blocking import BlockingScheduler
from datetime import datetime, timedelta
scheduler = BlockingScheduler()
today = datetime.now()
def task_daily(result_yesterday: int, day_counter: int, scheduler: BlockingScheduler, callback_return):
    # You can do something heavier here with more inputs.
    result_today = result_yesterday + 2
    day_counter += 1
    tomorrow = datetime.now() + timedelta(seconds=1)
    print(f"[{datetime.now().strftime('%H:%M:%S')}] (day {day_counter}) The result for today is {result_today}.")
    scheduler.add_job(task_daily, 'date',
                      args=(result_today, day_counter, scheduler, callback_return),
                      run_date=tomorrow)
scheduler.add_job(func=task_daily,
                  trigger='date',
                  args=(0, 0, scheduler, task_daily),
                  run_date=today)
scheduler.start()
The output:
[22:43:17] (day 1) The result for today is 2.
[22:43:18] (day 2) The result for today is 4.
[22:43:19] (day 3) The result for today is 6.
[22:43:20] (day 4) The result for today is 8.
[22:43:21] (day 5) The result for today is 10.
[22:43:22] (day 6) The result for today is 12.
[22:43:23] (day 7) The result for today is 14.

Scheduling Python Script to run every hour accurately

Before I ask, Cron Jobs and Task Scheduler will be my last options, this script will be used across Windows and Linux and I'd prefer to have a coded out method of doing this than leaving this to the end user to complete.
Is there a library for Python that I can use to schedule tasks? I will need to run a function once every hour, however, over time if I run a script once every hour and use .sleep, "once every hour" will run at a different part of the hour from the previous day due to the delay inherent to executing/running the script and/or function.
What is the best way to schedule a function to run at a specific time of day (more than once) without using a Cron Job or scheduling it with Task Scheduler?
Or if this is not possible, I would like your input as well.
AP Scheduler fit my needs exactly.
Version < 3.0
import datetime
import time
from apscheduler.scheduler import Scheduler
# Start the scheduler
sched = Scheduler()
sched.daemonic = False
sched.start()
def job_function():
print("Hello World")
print(datetime.datetime.now())
time.sleep(20)
# Schedules job_function to be run once each minute
sched.add_cron_job(job_function, minute='0-59')
out:
>Hello World
>2014-03-28 09:44:00.016.492
>Hello World
>2014-03-28 09:45:00.0.14110
Version > 3.0
(From Animesh Pandey's answer below)
from apscheduler.schedulers.blocking import BlockingScheduler
sched = BlockingScheduler()
#sched.scheduled_job('interval', seconds=10)
def timed_job():
print('This job is run every 10 seconds.')
#sched.scheduled_job('cron', day_of_week='mon-fri', hour=10)
def scheduled_job():
print('This job is run every weekday at 10am.')
sched.configure(options_from_ini_file)
sched.start()
Maybe this can help: Advanced Python Scheduler
Here's a small piece of code from their documentation:
from apscheduler.schedulers.blocking import BlockingScheduler
def some_job():
print "Decorated job"
scheduler = BlockingScheduler()
scheduler.add_job(some_job, 'interval', hours=1)
scheduler.start()
To run something every 10 minutes past the hour.
from datetime import datetime, timedelta
while 1:
print 'Run something..'
dt = datetime.now() + timedelta(hours=1)
dt = dt.replace(minute=10)
while datetime.now() < dt:
time.sleep(1)
For apscheduler < 3.0, see Unknown's answer.
For apscheduler > 3.0
from apscheduler.schedulers.blocking import BlockingScheduler
sched = BlockingScheduler()
#sched.scheduled_job('interval', seconds=10)
def timed_job():
print('This job is run every 10 seconds.')
#sched.scheduled_job('cron', day_of_week='mon-fri', hour=10)
def scheduled_job():
print('This job is run every weekday at 10am.')
sched.configure(options_from_ini_file)
sched.start()
Update:
apscheduler documentation.
This for apscheduler-3.3.1 on Python 3.6.2.
"""
Following configurations are set for the scheduler:
- a MongoDBJobStore named “mongo”
- an SQLAlchemyJobStore named “default” (using SQLite)
- a ThreadPoolExecutor named “default”, with a worker count of 20
- a ProcessPoolExecutor named “processpool”, with a worker count of 5
- UTC as the scheduler’s timezone
- coalescing turned off for new jobs by default
- a default maximum instance limit of 3 for new jobs
"""
from pytz import utc
from apscheduler.schedulers.blocking import BlockingScheduler
from apscheduler.jobstores.sqlalchemy import SQLAlchemyJobStore
from apscheduler.executors.pool import ProcessPoolExecutor
"""
Method 1:
"""
jobstores = {
'mongo': {'type': 'mongodb'},
'default': SQLAlchemyJobStore(url='sqlite:///jobs.sqlite')
}
executors = {
'default': {'type': 'threadpool', 'max_workers': 20},
'processpool': ProcessPoolExecutor(max_workers=5)
}
job_defaults = {
'coalesce': False,
'max_instances': 3
}
"""
Method 2 (ini format):
"""
gconfig = {
'apscheduler.jobstores.mongo': {
'type': 'mongodb'
},
'apscheduler.jobstores.default': {
'type': 'sqlalchemy',
'url': 'sqlite:///jobs.sqlite'
},
'apscheduler.executors.default': {
'class': 'apscheduler.executors.pool:ThreadPoolExecutor',
'max_workers': '20'
},
'apscheduler.executors.processpool': {
'type': 'processpool',
'max_workers': '5'
},
'apscheduler.job_defaults.coalesce': 'false',
'apscheduler.job_defaults.max_instances': '3',
'apscheduler.timezone': 'UTC',
}
sched_method1 = BlockingScheduler() # uses overrides from Method1
sched_method2 = BlockingScheduler() # uses same overrides from Method2 but in an ini format
#sched_method1.scheduled_job('interval', seconds=10)
def timed_job():
print('This job is run every 10 seconds.')
#sched_method2.scheduled_job('cron', day_of_week='mon-fri', hour=10)
def scheduled_job():
print('This job is run every weekday at 10am.')
sched_method1.configure(jobstores=jobstores, executors=executors, job_defaults=job_defaults, timezone=utc)
sched_method1.start()
sched_method2.configure(gconfig=gconfig)
sched_method2.start()
the simplest option I can suggest is using the schedule library.
In your question, you said "I will need to run a function once every hour"
the code to do this is very simple:
import schedule
def thing_you_wanna_do():
...
...
return
schedule.every().hour.do(thing_you_wanna_do)
while True:
schedule.run_pending()
you also asked how to do something at a certain time of the day
some examples of how to do this are:
import schedule
def thing_you_wanna_do():
...
...
return
schedule.every().day.at("10:30").do(thing_you_wanna_do)
schedule.every().monday.do(thing_you_wanna_do)
schedule.every().wednesday.at("13:15").do(thing_you_wanna_do)
# If you would like some randomness / variation you could also do something like this
schedule.every(1).to(2).hours.do(thing_you_wanna_do)
while True:
schedule.run_pending()
90% of the code used is the example code of the schedule library. Happy scheduling!
Run the script every 15 minutes of the hour.
For example, you want to receive 15 minute stock price quotes, which are updated every 15 minutes.
while True:
print("Update data:", datetime.now())
sleep = 15 - datetime.now().minute % 15
if sleep == 15:
run_strategy()
time.sleep(sleep * 60)
else:
time.sleep(sleep * 60)
#For scheduling task execution
import schedule
import time
def job():
print("I'm working...")
schedule.every(1).minutes.do(job)
#schedule.every().hour.do(job)
#schedule.every().day.at("10:30").do(job)
#schedule.every(5).to(10).minutes.do(job)
#schedule.every().monday.do(job)
#schedule.every().wednesday.at("13:15").do(job)
#schedule.every().minute.at(":17").do(job)
while True:
schedule.run_pending()
time.sleep(1)
The Python standard library provides sched and threading for this task. But this means your scheduler script has to be running all the time instead of leaving its execution to the OS, which may or may not be what you want.
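As a rough illustration of the standard-library route, here is a minimal sketch using `sched`. The `run_every` helper and its parameters are made up for this example. The time and delay functions are passed in only so the behaviour can be checked without real waiting.

```python
import sched
import time

def run_every(interval, func, repeats, timefunc=time.monotonic, delayfunc=time.sleep):
    """Call func() `repeats` times, `interval` seconds apart, using sched.

    run_every is a hypothetical helper for illustration only.
    """
    scheduler = sched.scheduler(timefunc, delayfunc)
    start = timefunc()
    for i in range(1, repeats + 1):
        # Absolute times keep the runs on a fixed grid even if func() is slow
        scheduler.enterabs(start + i * interval, 1, func)
    scheduler.run()  # blocks until every queued event has run
```

For example, `run_every(3600, my_job, 24)` would call a function `my_job` of your own once an hour for a day, and the script has to stay alive for that whole time.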
On the version posted by sunshinekitty called "Version < 3.0", you may need to specify apscheduler 2.1.2. I accidentally had version 3 on my 2.7 install, so I ran:
pip uninstall apscheduler
pip install apscheduler==2.1.2
It worked correctly after that. Hope that helps.
clock.py
from apscheduler.schedulers.blocking import BlockingScheduler
import pytz
sched = BlockingScheduler(timezone=pytz.timezone('Africa/Lagos'))
@sched.scheduled_job('cron', day_of_week='mon-sun', hour=22)
def scheduled_job():
    print('This job is run every day at 10pm.')
    # your job here
sched.start()
Procfile
clock: python clock.py
requirements.txt
APScheduler==3.0.0
After deployment, the final step is to scale up the clock process. This is a singleton process, meaning you’ll never need to scale up more than 1 of these processes. If you run two, the work will be duplicated.
$ heroku ps:scale clock=1
Source: https://devcenter.heroku.com/articles/clock-processes-python
Perhaps Rocketry suits your needs. It's a powerful scheduler that is very easy to use, has a lot of built-in scheduling options and it is easy to extend:
from rocketry import Rocketry
from rocketry.conds import daily, every, after_success
app = Rocketry()
@app.task(every("1 hour 30 minutes"))
def do_things():
    ...
@app.task(daily.between("12:00", "17:00"))
def do_daily_afternoon():
    ...
@app.task(daily & after_success(do_things))
def do_daily_after_task():
    ...
if __name__ == "__main__":
    app.run()
It has much more though:
String based scheduling syntax
Logical statements (AND, OR, NOT)
A lot of built-in scheduling options
Easy to customize (custom conditions, parameters etc.)
Parallelization (run on separate thread or process)
Parametrization (execution order and input-output)
Persistence: put the logs anywhere you like
Modify the scheduler at runtime (i.e. build an API on top of it)
Links:
Documentation: https://rocketry.readthedocs.io/
Source code: https://github.com/Miksus/rocketry
Disclaimer: I'm the author
You probably have a solution already, @lukik, but if you want to remove a scheduled job, you should use:
job = scheduler.add_job(myfunc, 'interval', minutes=2)
job.remove()
or
scheduler.add_job(myfunc, 'interval', minutes=2, id='my_job_id')
scheduler.remove_job('my_job_id')
if you need to use an explicit job ID.
For more information, you should check: https://apscheduler.readthedocs.io/en/stable/userguide.html#removing-jobs
I found that the scheduler needs to wake the program up every second, which would be costly on an online server.
So I use the following instead.
It runs every minute at the 5th second, and you can change it to hours or days by recalculating the waiting period in seconds.
import time
import datetime
initiating = True
print(datetime.datetime.now())
while True:
    if initiating:
        print("Initiate")
        print(datetime.datetime.now())
        # Sleep until 5 seconds past the next full minute
        time.sleep(60 - time.time() % 60 + 5)
        initiating = False
    else:
        time.sleep(60)
        print("working")
        print(datetime.datetime.now())
This method worked for me using relativedelta and datetime and a modulo boolean check for every hour.
It runs every hour from the time you start it.
import time
from datetime import datetime
from dateutil.relativedelta import relativedelta
# Track the next run outside the loop and update it within the loop
nxt_run = datetime.now()
# The loop polls once per second, so a boolean check decides when the job is due
while True:
    cnow = datetime.now()  # track the current time
    time.sleep(1)  # good to have so the CPU doesn't spike
    # Note: cnow.hour % 1 == 0 is always true; the comparison with nxt_run does the gating
    if cnow.hour % 1 == 0 and cnow >= nxt_run:
        print(f"start @{cnow}: next run @{nxt_run}")
        nxt_run = cnow + relativedelta(hours=1)  # add an hour to the next run
    else:
        print(f"next run @{nxt_run}")
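Because the loop above sets the next run to the current time plus one hour, any delay in noticing that a run is due is carried into every later run. Here is a sketch of an alternative, assuming you want runs to stay on a fixed grid counted from the start time. `advance_next_run` is a hypothetical helper, not part of dateutil or the standard library.

```python
from datetime import datetime, timedelta

def advance_next_run(next_run, now, interval):
    """Return the first scheduled time strictly after `now`.

    Steps forward from `next_run` in whole `interval`s, so the schedule
    stays on its original grid and skips any runs that were missed.
    """
    while next_run <= now:
        next_run += interval
    return next_run
```

In the loop, `nxt_run = advance_next_run(nxt_run, cnow, timedelta(hours=1))` would replace the line that adds an hour to `cnow`.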
One option is to write a C/C++ wrapper that executes the python script on a regular basis. Your end-user would run the C/C++ executable, which would remain running in the background, and periodically execute the python script. This may not be the best solution, and may not work if you don't know C/C++ or want to keep this 100% python. But it does seem like the most user-friendly approach, since people are used to clicking on executables. All of this assumes that python is installed on your end user's computer.
Another option is to use a cron job or Task Scheduler, but have the installer set it up with a script so your end user doesn't have to do it.
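A rough sketch of what such an installer step could generate, assuming an hourly run and paths without spaces. The `hourly_schedule_command` helper and the default task name are made up for this example. It only builds the text and does not register anything itself.

```python
import sys

def hourly_schedule_command(script_path, task_name="MyHourlyTask", platform=None):
    """Return text an installer could use to register an hourly run of script_path.

    On Windows this is a schtasks command line; elsewhere it is a crontab entry.
    Hypothetical helper; assumes neither path contains spaces.
    """
    platform = platform or sys.platform
    python = sys.executable  # the interpreter running the installer
    if platform.startswith("win"):
        # /SC HOURLY: run every hour, /TN: task name, /TR: command to run
        return f'schtasks /Create /SC HOURLY /TN "{task_name}" /TR "{python} {script_path}"'
    # crontab fields: minute hour day-of-month month day-of-week command
    return f"0 * * * * {python} {script_path}"
```

The installer would still need to run the schtasks command or append the crontab line itself, for example via `subprocess`, and that step may need the user's permission.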
