Sequentially run pending tasks with Python APScheduler - python

Suppose I have two cron triggers:
trigger1 = CronTrigger(second='0,20,40')
trigger2 = CronTrigger(second='0,10,20,30,40,50')
and I create my scheduler like this:
scheduler = BlockingScheduler()
scheduler.add_job(lambda: method1(param1, param2), trigger=trigger1)
scheduler.add_job(lambda: method2(param1, param3), trigger=trigger2)
with these two methods, which do some work:
def method1(s, t):
    print("doing work in method1")
    time.sleep(2)
    print("doing work in method1")
    time.sleep(2)
    print("doing work in method1")
    time.sleep(2)

def method2(s, t):
    print("doing work in method2")
    time.sleep(2)
    print("doing work in method2")
    time.sleep(2)
    print("doing work in method2")
    time.sleep(2)
When the scheduled times overlap (e.g. at 0, 20, 40) and the scheduler has two jobs scheduled for that time, it seems to run them in parallel. The output looks like this:
doing work in method1
doing work in method2
doing work in method1
doing work in method2
doing work in method1
doing work in method2
The question is: how do I set it up so that the pending jobs are run sequentially? I.e. if the times of two jobs overlap, run the first job until completion, then run the second one.
Edit: The reason I have used the apscheduler library is because I need cron-like functionality: I need the process to run between certain times of the day at certain intervals.

Use DebugExecutor.
For example:
import time

from apscheduler.schedulers.blocking import BlockingScheduler
from apscheduler.executors.debug import DebugExecutor

def foo1():
    print("x")

def foo2():
    time.sleep(3)
    print("y")

scheduler = BlockingScheduler()
scheduler.add_executor(DebugExecutor(), "consecutive")
scheduler.add_job(foo1, 'interval', max_instances=1, seconds=1, executor="consecutive")
scheduler.add_job(foo2, 'interval', max_instances=1, seconds=5, executor="consecutive")
scheduler.start()

Using DebugExecutor is a good idea. Additionally, I needed to specify a high value for the misfire_grace_time parameter in .add_job() to avoid skipping runs when multiple jobs share the same execution time.
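As a rough illustration of that comment (a sketch: the cron fields mirror the question, and the value 30 for misfire_grace_time is an arbitrary assumption; anything comfortably longer than the longest job should do):

import time

from apscheduler.executors.debug import DebugExecutor
from apscheduler.schedulers.blocking import BlockingScheduler

def job_a():
    time.sleep(2)
    print("job_a done")

def job_b():
    time.sleep(2)
    print("job_b done")

scheduler = BlockingScheduler()
scheduler.add_executor(DebugExecutor(), "consecutive")
# misfire_grace_time (in seconds) lets a job that starts late, because another job
# was still running in the single DebugExecutor, run anyway instead of being skipped.
scheduler.add_job(job_a, 'cron', second='0,20,40', executor="consecutive",
                  misfire_grace_time=30)
scheduler.add_job(job_b, 'cron', second='0,10,20,30,40,50', executor="consecutive",
                  misfire_grace_time=30)
scheduler.start()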


Python concurrency first result ends waiting for not yet done results

What I wish to do is print 'Move on...' after the first True, not caring about the not-yet-finished I/O-bound tasks. In the case below, two() is the first and only function to return True, so the program needs to execute like this:
Second
Move on..
NOT:
Second
First
Third
Move on...
import concurrent.futures
import time

def one():
    time.sleep(2)
    print('First')
    return False

def two():
    time.sleep(1)
    print('Second')
    return True

def three():
    time.sleep(4)
    print('Third')
    return False

tasks = [one, two, three]
with concurrent.futures.ThreadPoolExecutor(max_workers=3) as executor:
    for t in range(len(tasks)):
        executor.submit(tasks[t])
print('Move on...')
A with statement is not what you want here because it waits for all submitted jobs to finish. You need to submit the tasks, as you already do, but then iterate over as_completed and stop as soon as a task returns True:
executor = concurrent.futures.ThreadPoolExecutor()
futures = [executor.submit(t) for t in tasks]
for f in concurrent.futures.as_completed(futures):
    if f.result():
        break
print('Move on...')
The problem with concurrent.futures.ThreadPoolExecutor is that once tasks are submitted they will run to completion, so the program will print 'Move on...', but if there is in fact nothing else to do, it will not terminate until functions one and three finish (and print their messages). So the program is guaranteed to run for at least 4 seconds.
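As a side note, a small mitigation exists (a sketch, assuming Python 3.9+ and the one/two/three functions above): Executor.shutdown(cancel_futures=True) cancels futures that have not started yet. It only helps when there are more tasks than workers, since already-running threads still finish.

executor = concurrent.futures.ThreadPoolExecutor(max_workers=2)  # fewer workers than tasks
futures = [executor.submit(t) for t in (one, two, three)]
for f in concurrent.futures.as_completed(futures):
    if f.result():
        break
# Cancels the still-queued third task; the already-running tasks cannot be stopped.
executor.shutdown(wait=False, cancel_futures=True)
print('Move on...')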
It is better to use the ThreadPool class in the multiprocessing.pool module, which supports a terminate method that will kill all outstanding tasks. The closest thing to an as_completed method would probably be the imap_unordered method, but that requires a single worker function being used for all 3 tasks. Instead, we can use apply_async, specifying a callback function to be invoked as results become available:
from multiprocessing.pool import ThreadPool
import time
from threading import Event

def one():
    time.sleep(2)
    print('First')
    return False

def two():
    time.sleep(1)
    print('Second')
    return True

def three():
    time.sleep(4)
    print('Third')
    return False

def my_callback(result):
    if result:
        executor.terminate()  # kill all other tasks
        done_event.set()

tasks = [one, two, three]
executor = ThreadPool(3)
done_event = Event()
for t in tasks:
    executor.apply_async(t, callback=my_callback)
done_event.wait()
print("Moving on ...")

Python wait for x seconds without sleeping program? [duplicate]

I'm trying to run 2 functions at the same time.
def func1():
    print('Working')

def func2():
    print('Working')

func1()
func2()
Does anyone know how to do this?
Do this:
from threading import Thread

def func1():
    print('Working')

def func2():
    print("Working")

if __name__ == '__main__':
    Thread(target=func1).start()
    Thread(target=func2).start()
The answer about threading is good, but you need to be a bit more specific about what you want to do.
If you have two functions that both use a lot of CPU, threading (in CPython) will probably get you nowhere. Then you might want to have a look at the multiprocessing module, or possibly you might want to use Jython/IronPython.
If CPU-bound performance is the reason, you could even implement things in (non-threaded) C and get a much bigger speedup than doing two parallel things in Python.
Without more information, it isn't easy to come up with a good answer.
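To make the CPU-bound case above concrete, here is a minimal sketch (the busy() function is made up for illustration) that uses processes rather than threads so both calls can occupy separate cores:

import concurrent.futures

def busy(n):
    # Stand-in for CPU-heavy work.
    total = 0
    for i in range(n):
        total += i * i
    return total

if __name__ == '__main__':
    # Each call runs in its own process, so the GIL is not a bottleneck.
    with concurrent.futures.ProcessPoolExecutor() as pool:
        print(list(pool.map(busy, [10_000_000, 10_000_000])))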
This can be done elegantly with Ray, a system that allows you to easily parallelize and distribute your Python code.
To parallelize your example, you'd need to define your functions with the @ray.remote decorator, and then invoke them with .remote.
import ray

ray.init()

# Define functions you want to execute in parallel using
# the ray.remote decorator.
@ray.remote
def func1():
    print("Working")

@ray.remote
def func2():
    print("Working")

# Execute func1 and func2 in parallel.
ray.get([func1.remote(), func2.remote()])
If func1() and func2() return results, you need to rewrite the above code a bit, by replacing ray.get([func1.remote(), func2.remote()]) with:
ret_id1 = func1.remote()
ret_id2 = func2.remote()
ret1, ret2 = ray.get([ret_id1, ret_id2])
There are a number of advantages of using Ray over the multiprocessing module or using multithreading. In particular, the same code will run on a single machine as well as on a cluster of machines.
For more advantages of Ray see this related post.
One option that looks like it makes two functions run at the same time is using the threading module (example in this answer). However, it has a small delay, as an official Python documentation page describes. A better module to try is multiprocessing.
Also, there's other Python modules that can be used for asynchronous execution (two pieces of code working at the same time). For some information about them and help to choose one, you can read this Stack Overflow question.
Comment from another user about the threading module:
He might want to know that because of the Global Interpreter Lock they will not execute at the exact same time even if the machine in question has multiple CPUs. wiki.python.org/moin/GlobalInterpreterLock
– Jonas Elfström Jun 2 '10 at 11:39
Quote from the documentation about the threading module's limitation:
CPython implementation detail: In CPython, due to the Global Interpreter Lock, only one thread can execute Python code at once (even though certain performance-oriented libraries might overcome this limitation). If you want your application to make better use of the computational resources of multi-core machines, you are advised to use multiprocessing or concurrent.futures.ProcessPoolExecutor. However, threading is still an appropriate model if you want to run multiple I/O-bound tasks simultaneously.
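For the I/O-bound case, a minimal sketch (fetch() and its sleep are stand-ins for real blocking calls) showing two waits overlapping in threads:

import concurrent.futures
import time

def fetch(name):
    time.sleep(1)  # stands in for a blocking network or disk call
    return name + " done"

# The two simulated waits overlap, so this takes about 1 second rather than 2.
with concurrent.futures.ThreadPoolExecutor() as pool:
    print(list(pool.map(fetch, ["task-a", "task-b"])))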
The threading module does run things simultaneously, unlike multiprocess, but the timing is a bit off. The code below prints a "1" and a "2"; these are printed by two different functions. I did notice that when printed to the console, they would have slightly different timings.
from threading import Thread
import time

num = 1  # flag checked by both loops (not defined in the original snippet)

def one():
    while 1 == num:
        print("1")
        time.sleep(2)

def two():
    while 1 == num:
        print("2")
        time.sleep(2)

p1 = Thread(target=one)
p2 = Thread(target=two)
p1.start()
p2.start()
Output: (Note the space is for the wait in between printing)
1
2
2
1
12
21
12
1
2
Not sure if there is a way to correct this, or if it matters at all. Just something I noticed.
Try this
from threading import Thread

def fun1():
    print("Working1")

def fun2():
    print("Working2")

t1 = Thread(target=fun1)
t2 = Thread(target=fun2)
t1.start()
t2.start()
In case you also want to wait until both functions have been completed:
from threading import Thread

def func1():
    print('Working')

def func2():
    print('Working')

# Define the threads and put them in a list
threads = [
    Thread(target=func1),
    Thread(target=func2),
]

# func1 and func2 run in separate threads
for thread in threads:
    thread.start()

# Wait until both func1 and func2 have finished
for thread in threads:
    thread.join()
Another approach to running multiple functions concurrently in Python is asyncio, which I couldn't see among the other answers.
import asyncio

async def func1():
    for _ in range(5):
        print(func1.__name__)
        await asyncio.sleep(0)  # switches tasks every iteration.

async def func2():
    for _ in range(5):
        print(func2.__name__)
        await asyncio.sleep(0)

tasks = [func1(), func2()]
await asyncio.gather(*tasks)
Out:
func1
func2
func1
func2
func1
func2
func1
func2
func1
func2
[NOTE]:
The above asyncio syntax is valid on Python 3.7 and later (as written, the top-level await needs an async REPL or a notebook; a plain-script version is sketched below).
multiprocessing vs multithreading vs asyncio
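For completeness, the same example as a plain script (a sketch; top-level await as shown above only works in an async REPL or a notebook, so a script wraps the coroutines in asyncio.run):

import asyncio

async def func1():
    for _ in range(5):
        print(func1.__name__)
        await asyncio.sleep(0)  # yield control so the other task can run

async def func2():
    for _ in range(5):
        print(func2.__name__)
        await asyncio.sleep(0)

async def main():
    await asyncio.gather(func1(), func2())

asyncio.run(main())  # asyncio.run is available since Python 3.7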
The code below can run 2 functions in parallel:
from multiprocessing import Process

def test1():
    print("Test1")

def test2():
    print("Test2")

if __name__ == "__main__":
    process1 = Process(target=test1)
    process2 = Process(target=test2)
    process1.start()
    process2.start()
    process1.join()
    process2.join()
Result:
Test1
Test2
And, these 2 sets of code below can run 2 functions concurrently:
from threading import Thread

def test1():
    print("Test1")

def test2():
    print("Test2")

thread1 = Thread(target=test1)
thread2 = Thread(target=test2)
thread1.start()
thread2.start()
thread1.join()
thread2.join()
from operator import methodcaller
from multiprocessing.pool import ThreadPool

def test1():
    print("Test1")

def test2():
    print("Test2")

caller = methodcaller("__call__")
ThreadPool().map(caller, [test1, test2])
Result:
Test1
Test2
And, this code below can run 2 async functions concurrently and asynchronously:
import asyncio

async def test1():
    print("Test1")

async def test2():
    print("Test2")

async def call_tests():
    await asyncio.gather(test1(), test2())

asyncio.run(call_tests())
Result:
Test1
Test2
I think what you are trying to do can be achieved through multiprocessing. However, if you want to do it with threads, you can do this.
This might help
from threading import Thread
import time

def func1():
    print('Working')
    time.sleep(2)

def func2():
    print('Working')
    time.sleep(2)

th = Thread(target=func1)
th.start()
th1 = Thread(target=func2)
th1.start()
Test using APScheduler:
from apscheduler.schedulers.background import BackgroundScheduler
import datetime
import time

dt = datetime.datetime
Future = dt.now() + datetime.timedelta(milliseconds=2550)  # 2.55 seconds from now, testing start accuracy

def myjob1():
    print('started job 1: ' + str(dt.now())[:-3])  # timed to the millisecond because that's where it varies
    time.sleep(5)
    print('job 1 half at: ' + str(dt.now())[:-3])
    time.sleep(5)
    print('job 1 done at: ' + str(dt.now())[:-3])

def myjob2():
    print('started job 2: ' + str(dt.now())[:-3])
    time.sleep(5)
    print('job 2 half at: ' + str(dt.now())[:-3])
    time.sleep(5)
    print('job 2 done at: ' + str(dt.now())[:-3])

print(' current time: ' + str(dt.now())[:-3])
print('  do job 1 at: ' + str(Future)[:-3] + '''
  do job 2 at: ''' + str(Future)[:-3])

sched = BackgroundScheduler()  # the original snippet omitted creating and starting the scheduler
sched.add_job(myjob1, 'date', run_date=Future)
sched.add_job(myjob2, 'date', run_date=Future)
sched.start()
time.sleep(15)  # keep the main thread alive long enough for both jobs to finish
I got these results, which prove they are running at the same time.
current time: 2020-12-15 01:54:26.526
do job 1 at: 2020-12-15 01:54:29.072 # i figure these both say .072 because its 1 line of print code
do job 2 at: 2020-12-15 01:54:29.072
started job 2: 2020-12-15 01:54:29.075 # notice job 2 started before job 1, but code calls job 1 first.
started job 1: 2020-12-15 01:54:29.076
job 2 half at: 2020-12-15 01:54:34.077 # halfway point on each job completed same time accurate to the millisecond
job 1 half at: 2020-12-15 01:54:34.077
job 1 done at: 2020-12-15 01:54:39.078 # job 1 finished first. making it .004 seconds faster.
job 2 done at: 2020-12-15 01:54:39.091 # job 2 was .002 seconds faster the second test
I might be wrong, but with this piece of code:
import time
from multiprocessing import Process

def function_sleep():
    time.sleep(5)

start_time = time.time()
p1 = Process(target=function_sleep)
p2 = Process(target=function_sleep)
p1.start()
p2.start()
end_time = time.time()
I timed it and expected to get around 5-6 seconds, while it always takes double the argument passed to sleep (10 seconds in this case).
What's going on?
Sorry guys, as mentioned in the previous comment, join() needs to be called.
That's very important!
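For reference, a corrected sketch of that snippet with the join() calls in place (plus a main guard, which multiprocessing needs on some platforms); the two sleeps overlap, so the measured time comes out around 5 seconds:

import time
from multiprocessing import Process

def function_sleep():
    time.sleep(5)

if __name__ == '__main__':
    start_time = time.time()
    p1 = Process(target=function_sleep)
    p2 = Process(target=function_sleep)
    p1.start()
    p2.start()
    p1.join()  # wait for both processes before taking the end time
    p2.join()
    end_time = time.time()
    print(end_time - start_time)  # roughly 5 seconds, since the sleeps run in parallel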

Python, schedule parallel Threads with one thread for each method

How can I run those two tasks in parallel, but if the thread for a given method has not finished yet, just skip that method until the next schedule iteration?
Because right now it creates a new thread for the same method while that method is still running.
import threading
import time

import schedule

def task1():
    # do task1
    pass

def task2():
    # do task2
    pass

def run_threaded(job_fn):
    job_thread = threading.Thread(target=job_fn)
    job_thread.start()

schedule.every(5).minutes.do(run_threaded, task1)
schedule.every(3).minutes.do(run_threaded, task2)

while True:
    schedule.run_pending()
    time.sleep(1)
Figured it out with another module called apscheduler.
It has a max_instances parameter (set to 1) and logs something like this when an instance is skipped:
*Execution of job "task1 (trigger: interval[0:50:0], next run at: 2019-02-16 11:38:23 EET)" skipped: maximum number of running instances reached (1)*
scheduler = BackgroundScheduler(executors=executors, job_defaults=job_defaults)
scheduler.add_job(task1, 'interval', minutes=5)
scheduler.add_job(task2, 'interval', minutes=7)
scheduler.start()
You don't need to create a threading.Thread, because the module does this for you. Just pass your method.
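Since the snippet above uses executors and job_defaults without defining them, here is a fuller sketch (the pool size and the coalesce setting are assumptions following the APScheduler user guide; task1 and task2 are the functions from the question):

from apscheduler.executors.pool import ThreadPoolExecutor
from apscheduler.schedulers.background import BackgroundScheduler

executors = {'default': ThreadPoolExecutor(10)}        # assumed pool size
job_defaults = {'coalesce': True, 'max_instances': 1}  # at most one running instance per job

scheduler = BackgroundScheduler(executors=executors, job_defaults=job_defaults)
scheduler.add_job(task1, 'interval', minutes=5)
scheduler.add_job(task2, 'interval', minutes=7)
scheduler.start()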

How schedule a job (Django, Python)

I would like to create a job that runs every 10 minutes.
I found a good example here. The problem is that the program freezes during the waiting time and my other URLs are blocked.
I think it is because of the while True: loop.
Is there a way to work around this problem?
Here is the code:
import schedule
import time

def job():
    print("I'm working...")

schedule.every(10).minutes.do(job)

while True:
    schedule.run_pending()
    time.sleep(1)
I found the right way to do it. Here is the link:
For that to work well, I removed this part:
# time.sleep(20)
# print('Checkpoint **************************')
# time.sleep(30)
# print('Bye -----------------------')
Here is the code that works:
import threading
import time

class ThreadingExample(object):
    """ Threading example class

    The run() method will be started and it will run in the background
    until the application exits.
    """

    def __init__(self, interval=10):
        """ Constructor

        :type interval: int
        :param interval: Check interval, in seconds
        """
        self.interval = interval

        thread = threading.Thread(target=self.run, args=())
        thread.daemon = True  # Daemonize thread
        thread.start()        # Start the execution

    def run(self):
        """ Method that runs forever """
        while True:
            # Do something
            print('Doing something important in the background', self.interval)
            pk_info_semaine = job_temp.objects.all()  # job_temp is the poster's Django model
            for a in pk_info_semaine:
                print('num_semaine:', a.num_semaine, 'user_id:', a.user_id)
            time.sleep(self.interval)

example = ThreadingExample()
Thank you all, and thank you to the author, Paris Nakita Kejser (here).
You can use Celery + Celery Beat together with Django to run scheduled tasks. You can write your method as a Celery task, and add an entry in your settings.py file to make the task run every 10 minutes. The task will run in its own worker process, hence not blocking your application.
Here is the link to Celery:
http://docs.celeryproject.org/en/latest/django/first-steps-with-django.html
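As a rough sketch of what that settings.py entry can look like (the task path myapp.tasks.my_task and the exact schedule are assumptions; this follows the first-steps-with-Django layout where the Celery app reads Django settings with the CELERY namespace):

# settings.py (sketch)
from datetime import timedelta

CELERY_BEAT_SCHEDULE = {
    'run-my-task-every-10-minutes': {
        'task': 'myapp.tasks.my_task',  # assumed dotted path to a @shared_task
        'schedule': timedelta(minutes=10),
    },
}

The beat process is then started alongside the worker (for development, the worker can embed it with the -B flag).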

Python - Apscheduler not stopping a job even after using 'remove_job'

This is my code
I'm using the remove_job and the shutdown functions of the scheduler to stop a job, but it keeps on executing.
What is the correct way to stop a job from executing any further?
from apscheduler.schedulers.background import BlockingScheduler

def job_function():
    print("job executing")

scheduler = BlockingScheduler(standalone=True)
scheduler.add_job(job_function, 'interval', seconds=1, id='my_job_id')
scheduler.start()
scheduler.remove_job('my_job_id')
scheduler.shutdown()
Simply ask the scheduler to remove the job from inside job_function itself, using remove_job. As @Akshay Pratap Singh correctly pointed out, control never returns back from start().
from apscheduler.schedulers.background import BlockingScheduler

count = 0

def job_function():
    print("job executing")
    global count, scheduler

    # Execute the job till the count of 5
    count = count + 1
    if count == 5:
        scheduler.remove_job('my_job_id')

scheduler = BlockingScheduler()
scheduler.add_job(job_function, 'interval', seconds=1, id='my_job_id')
scheduler.start()
As you are using BlockingScheduler, you should first understand its nature.
Basically, BlockingScheduler is a scheduler which runs in the foreground (i.e. start() will block the program). In layman's terms, it runs in the foreground, so when you call start(), the call never returns. That's why all lines that follow start() are never executed, which is why your scheduler never stopped.
BlockingScheduler can be useful if you want to use APScheduler as a standalone scheduler (e.g. to build a daemon).
Solution
If you want to stop your scheduler after running some code, then you should opt for one of the other scheduler types listed in the APScheduler docs.
I recommend BackgroundScheduler if you want the scheduler to run in the background inside your application/program; you can then pause, resume and remove jobs at any time, whenever you need to.
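A minimal sketch of that approach (the one-second interval and the five-second lifetime are arbitrary choices for illustration):

import time

from apscheduler.schedulers.background import BackgroundScheduler

def job_function():
    print("job executing")

scheduler = BackgroundScheduler()
scheduler.add_job(job_function, 'interval', seconds=1, id='my_job_id')
scheduler.start()  # returns immediately; jobs run on background threads

time.sleep(5)      # the rest of the program keeps running in the meantime

scheduler.remove_job('my_job_id')  # stop this particular job
scheduler.shutdown()               # stop the scheduler itself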
The scheduler needs to be stopped from another thread. The thread in which scheduler.start() is called gets blocked by the scheduler. The lines that you've written after scheduler.start() are unreachable code.
This is how I solved the problem. Pay attention to the position where the code schedule.shutdown() is located!
from apscheduler.schedulers.blocking import BlockingScheduler

def do_something():
    global schedule
    print("schedule execute")
    # schedule.remove_job(id='rebate')
    schedule.shutdown(wait=False)

if __name__ == '__main__':
    schedule = BlockingScheduler()
    schedule.add_job(do_something, 'cron', id='rebate', month=12, day=5, hour=17, minute=47, second=35)
    schedule.start()
    print('over')
