Running APScheduler in a Python script as a daemon? - python

I have a job.py which has the following code.
import datetime
import logging
import sys
import os
from apscheduler.scheduler import Scheduler
from src.extractors.pExtractor import somejob
def run_job():
    start = datetime.datetime.now()
    logging.debug('Proposal extraction job starting')
    somejob.main()
    end = datetime.datetime.now()
    duration = end - start
    logging.debug('job completed , took ' + str(duration.seconds) + ' seconds')
def main():
    logging.basicConfig(filename='/tmp/pExtractor.log', level=logging.DEBUG, format='%(levelname)s[%(asctime)s]: %(message)s')
    sched = Scheduler()
    sched.start()
    sched.add_interval_job(run_job, minutes=2)
if __name__ == '__main__':
    main()
When I run this from the command prompt, it exits immediately:
INFO[2012-04-03 13:31:02,825]: Started thread pool with 0 core threads and 20 maximum threads
INFO[2012-04-03 13:31:02,827]: Scheduler started
INFO[2012-04-03 13:31:02,827]: Added job "run_job (trigger: cron[minute='2'], next run at: 2012-04-03 14:02:00)" to job store "default"
INFO[2012-04-03 13:31:02,828]: Shutting down thread pool
How can I make this run as a daemon?

Write your main() as below.
def main():
    [... your_code_as_in_your_question ...]
    while (True):
        pass
Additionally it shouldn't hurt to consider PEP 3143.
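A busy `while True: pass` loop keeps the process alive but also keeps a CPU core spinning. Below is a minimal standard-library sketch of a gentler variant in which the main thread sleeps in short intervals until a stop flag is set. The names `keep_alive`, `stop_event`, and `poll_seconds` are hypothetical and not part of APScheduler; the scheduler would be started before this call, as in the question.

```python
import threading


def keep_alive(stop_event, poll_seconds=1.0):
    """Block the calling thread until stop_event is set.

    Event.wait() sleeps instead of spinning, so the process stays alive
    for background threads without burning CPU.
    Returns the number of wake-ups that happened before the stop.
    """
    wakeups = 0
    while not stop_event.wait(timeout=poll_seconds):
        wakeups += 1
    return wakeups


if __name__ == "__main__":
    stop = threading.Event()
    # Hypothetical stand-in for "something decides to stop": a short timer.
    threading.Timer(0.3, stop.set).start()
    print("wake-ups before stop:", keep_alive(stop, poll_seconds=0.1))
```

In a real script, nothing would set the event until a signal handler or an admin action did, so `keep_alive` would simply block for the lifetime of the process.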

Related

Multiprocess web application using API or MessageQueue

I'm testing multiprocessing using apply_async.
However, it looks like each apply_async is called from MainProcess and it's not exactly asynchronous. Each function is called only after previous one is finished. I'm not sure what I'm missing here.
I'm using Windows with Python 3.8, so it's using the spawn method to create processes.
import os
import time
from multiprocessing import Pool, cpu_count, current_process
from threading import current_thread
def go_to_sleep():
    pid = os.getpid()
    thread_name = current_thread().name
    process_name = current_process().name
    print(f"{pid} Process {process_name} and {thread_name} going to sleep")
    time.sleep(5)
def apply_async():
    pool = Pool(processes=cpu_count())
    print(f"Number of procesess {len(pool._pool)}")
    for i in range(20):
        pool.apply_async(go_to_sleep())
    pool.close()
    pool.join()
def main():
    apply_async()
if __name__ == "__main__":
    start_time = time.perf_counter()
    main()
    end_time = time.perf_counter()
    print(f"Elapsed run time: {end_time - start_time} seconds.")
Output:
Number of procesess 8
26776 Process MainProcess and MainThread going to sleep
26776 Process MainProcess and MainThread going to sleep
26776 Process MainProcess and MainThread going to sleep
The problem is that your code is not actually calling the specified function in the process pool, it is calling it in the main thread, and passing the result of calling it to pool.apply_async.
That is, instead of calling pool.apply_async(go_to_sleep()), you should call pool.apply_async(go_to_sleep). You need to pass the function that should be called to Pool.apply_async - you should not call the function when you call Pool.apply_async.
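As a rough illustration of the difference, the sketch below passes the callable and its arguments separately to `apply_async`. It deliberately uses `multiprocessing.pool.ThreadPool`, which exposes the same `apply_async` interface as `Pool`, so the example runs without the spawn-related `__main__` requirements; swapping in `Pool` is an assumption left to the reader. The helper `run_tasks` and the `seconds` parameter are hypothetical names.

```python
import time
from multiprocessing.pool import ThreadPool


def go_to_sleep(seconds):
    """Stand-in task: sleep, then return the value it was given."""
    time.sleep(seconds)
    return seconds


def run_tasks(n_tasks=4, seconds=0.2):
    """Submit n_tasks calls and collect their results."""
    with ThreadPool(processes=n_tasks) as pool:
        # Pass the function object plus an argument tuple.
        # Writing go_to_sleep(seconds) here would run it in the caller instead.
        pending = [pool.apply_async(go_to_sleep, (seconds,)) for _ in range(n_tasks)]
        # get() blocks until the corresponding task has finished.
        return [result.get() for result in pending]


if __name__ == "__main__":
    start = time.perf_counter()
    print(run_tasks())
    print(f"Elapsed: {time.perf_counter() - start:.2f} seconds")
```

Because the workers sleep concurrently, the elapsed time should be close to one sleep interval rather than the sum of all of them.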

Multiple Threads or Workers in Python? - Want to increase Performance

Currently I have a small Python script running that makes some web requests.
I am completely new to Python, so I took a bare-bones script I found, which uses multiple threads (see the end of this post for the full script):
if __name__ == '__main__':
    threads = []
    for i in range(THREAD_COUNT):
        t = Thread(target=callback)
        threads.append(t)
        t.start()
    for t in threads:
        t.join()
However, I feel this script is rather slow, as if it makes the requests one after another rather than at the same time.
So I took another approach and tried to find out more about workers and multithreading.
It seems "workers" are the way to go, instead of threads?
So I took the following from a tutorial and modified it a little:
import logging
import os
from queue import Queue
from threading import Thread
from time import time
from multi import callback
logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(name)s - %(levelname)s - %(message)s')
logger = logging.getLogger(__name__)
class DownloadWorker(Thread):
    def __init__(self, queue):
        Thread.__init__(self)
        self.queue = queue
    def run(self):
        while True:
            # that is my Function in Multi.py (A simple Web Request Function)
            try:
                callback()
            finally:
                self.queue.task_done()
if __name__ == '__main__':
    ts = time()
    queue = Queue()
    for x in range(8):
        worker = DownloadWorker(queue)
        worker.daemon = True
        worker.start()
    # I put that here, because I want to run my "Program" infinite times
    for i in range(500000):
        logger.info('Queueing')
        queue.put(i)
    queue.join()
    logging.info('Took %s', time() - ts)
I am not sure whether this is the correct approach. From my understanding, I created 8 workers, and with queue.put(i) I give them jobs (500,000 in this case?), passing them the current counter (which does nothing, but seems to be required?).
After the queueing is done, the function is executed, as I can see in my console.
However, I feel it still runs just as slowly as before.
(My original request file)
from threading import Thread
import requests
import json
import string
import urllib3
import threading
THREAD_COUNT = 5
urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)
def callback():
    counter = 0
    try:
        while True:
            print("Prozess " + str(threading.get_ident())+ " " +str(counter))
            counter = counter + 1
            response = requests.post('ourAPIHere',verify=False, json={"pingme":"hello"})
            json_data = json.loads(response.text)
            if json_data["status"] == "error":
                print("Server Error? Check logs!")
            if json_data["status"] == "success":
                print("OK")
    except KeyboardInterrupt:
        return
if __name__ == '__main__':
    threads = []
    for i in range(THREAD_COUNT):
        t = Thread(target=callback)
        threads.append(t)
        t.start()
    for t in threads:
        t.join()
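One thing worth noting about the queue-based version: its worker never calls queue.get(), and callback() contains its own infinite loop, so the queued items are never consumed and task_done() is never reached. For comparison, here is a minimal sketch of the usual queue/worker pattern, where each worker takes one item, handles it, and marks it done. The names `worker`, `run_workers`, and `handle` are hypothetical, and `handle` stands in for a single web request rather than the looping callback() from the question.

```python
import threading
from queue import Queue


def worker(tasks, results, handle):
    """Consume items from the queue until a None sentinel arrives."""
    while True:
        item = tasks.get()  # blocks until an item is available
        try:
            if item is None:
                return  # sentinel: this worker is finished
            results.append(handle(item))  # list.append is thread-safe in CPython
        finally:
            tasks.task_done()  # exactly one task_done() per get()


def run_workers(items, handle, n_workers=4):
    """Process all items with n_workers threads and return the results."""
    tasks, results = Queue(), []
    threads = [
        threading.Thread(target=worker, args=(tasks, results, handle))
        for _ in range(n_workers)
    ]
    for t in threads:
        t.start()
    for item in items:
        tasks.put(item)
    for _ in threads:
        tasks.put(None)  # one sentinel per worker
    tasks.join()  # returns once every queued item was marked done
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    print(sorted(run_workers(range(10), lambda x: x * x)))
```

Results arrive in completion order, not submission order, which is why the usage line sorts them before printing.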

Python schedule does not work in Flask

I am importing schedule into Flask. My project uses WSGI, but I know little about the relationship between Flask and WSGI. I have three main files:
wsgi.py: automatically generated by another tool.
app.py: where I handle client requests.
test.py: used to test schedule.
I want to start a long-running task when the server launches. Here is the relevant part of wsgi.py:
# -*- coding: utf-8 -*-
from threading import Thread
import test
t = Thread(target=test.job)
t.start()
if __name__ == '__main__':
    ...
As you can see, I start a thread and let the job run in it. Here is my test.py:
import schedule
def job():
    schedule.every(1).seconds.do(pr)
def pr():
    print("I'm working...")
My problem is that the job never starts.
I found my problem: I never let schedule execute its jobs. Now wsgi.py looks like this:
# -*- coding: utf-8 -*-
from threading import Thread
import schedule  # needed for the schedule.every(...) call below
import test
schedule.every(1).seconds.do(test.job)
t = Thread(target=test.run_schedule)
t.start()
if __name__ == '__main__':
    ...
And test.py:
import schedule
import time
start_time = time.time()
def job():
    print("I'm working..." + str(time.time() - start_time))
def run_schedule():
    while True:
        schedule.run_pending()
        time.sleep(1)
To keep this off the main thread, I create a separate thread that loops once per second. In each iteration, schedule's run_pending() runs the job if its interval has elapsed (1 s in my case).
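The polling loop described here can also be made stoppable. The sketch below is a standard-library-only illustration: it takes any zero-argument callable as `run_pending` (for this use case that would presumably be `schedule.run_pending`, which is an assumption since the sketch does not import the library) and returns an event that ends the loop. The name `start_background_loop` is hypothetical.

```python
import threading


def start_background_loop(run_pending, interval=1.0):
    """Call run_pending() every `interval` seconds in a daemon thread.

    Returns (stop_event, thread); setting stop_event ends the loop.
    """
    stop_event = threading.Event()

    def loop():
        while not stop_event.is_set():
            run_pending()
            # wait() doubles as the sleep and returns early once stop is set
            stop_event.wait(interval)

    thread = threading.Thread(target=loop, daemon=True)
    thread.start()
    return stop_event, thread


if __name__ == "__main__":
    ticks = []
    stop, thread = start_background_loop(lambda: ticks.append(1), interval=0.05)
    threading.Event().wait(0.3)  # let the loop run briefly
    stop.set()
    thread.join()
    print("run_pending was called", len(ticks), "times")
```

Using `Event.wait(interval)` rather than `time.sleep(interval)` means a stop request takes effect immediately instead of after the current sleep finishes.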

Why doesn't this work? Is this an APScheduler bug?

When I run this, it waits a minute and prints 'Lights on', then waits two minutes and prints 'Lights off'. After that, APScheduler seems to go nuts and alternates between the two very quickly.
Did I just stumble into an APScheduler bug, or why does this happen?
from datetime import datetime, timedelta
import time
import os, signal, logging
logging.basicConfig(level=logging.DEBUG)
from apscheduler.schedulers.background import BackgroundScheduler
scheduler = BackgroundScheduler()
def turn_on():
    #Turn ON
    print('##############################Lights on')
def turn_off():
    #Turn off
    print('#############################Lights off')
def schedule():
    print('Lights will turn on at'.format(lights_on_time))
if __name__ == '__main__':
    while True:
        lights_on_time = (str(datetime.now() + timedelta(minutes=1)))
        lights_off_time = (str(datetime.now() + timedelta(minutes=2)))
        scheduler.add_job(turn_on, 'date', run_date=lights_on_time)
        scheduler.add_job(turn_off, 'date', run_date=lights_off_time)
        try:
            scheduler.start()
            signal.pause()
        except:
            pass
    print('Press Ctrl+{0} to exit'.format('Break' if os.name == 'nt' else 'C'))
    try:
        # This is here to simulate application activity (which keeps the main thread alive).
        while True:
            time.sleep(2)
    except (KeyboardInterrupt, SystemExit):
        # Not strictly necessary if daemonic mode is enabled but should be done if possible
        scheduler.shutdown()
You are flooding the scheduler with events. You are using the BackgroundScheduler, meaning that scheduler.start() returns immediately instead of waiting for the events to happen. The simplest fix may be to use the BlockingScheduler instead of the BackgroundScheduler, or to put a sleep(180) in your loop.
Try this:
from datetime import datetime, timedelta
from apscheduler.schedulers.background import BackgroundScheduler
import time
scheduler = BackgroundScheduler()
def turn_on():
    print('Turn on', datetime.now())
def turn_off():
    print('Turn off', datetime.now())
scheduler.start()
while True:
    scheduler.add_job(func=turn_on, trigger='date', next_run_time=datetime.now() + timedelta(minutes=1))
    scheduler.add_job(func=turn_off, trigger='date', next_run_time=datetime.now() + timedelta(minutes=2))
    time.sleep(180)
You should only start the scheduler once.

APScheduler will not stop

I have Python code that I am developing for a website that, among other things, creates an Excel sheet and then converts it into a JSON file. I need this code to run continuously unless it is killed by the website administrator.
To this end, I am using APScheduler.
The code runs perfectly without APScheduler, but when I attempt to add the rest of the code, one of two things happens: 1) it runs forever and will not stop despite using Ctrl+C, so I need to stop it using Task Manager, or 2) it only runs once and then stops.
Code that doesn't stop:
from apscheduler.scheduler import Scheduler
import logging
import time
logging.basicConfig()
sched = Scheduler()
sched.start()
(...)
code to make excel sheet and json file
(...)
#sched.interval_schedule(seconds = 15)
def job():
    excelapi_final()
while True:
    time.sleep(10)
sched.shutdown(wait=False)
Code that stops running after one time:
from apscheduler.scheduler import Scheduler
import logging
import time
logging.basicConfig()
sched = Scheduler()
(...)
#create excel sheet and json file
(...)
#sched.interval_schedule(seconds = 15)
def job():
    excelapi_final()
sched.start()
while True:
    time.sleep(10)
    sched.shutdown(wait=False)
I understand from other questions, a few tutorials, and the documentation that sched.shutdown should allow the code to be killed by Ctrl+C; however, that is not working. Any ideas? Thanks in advance!
You could use the standalone mode:
sched = Scheduler(standalone=True)
and then start the scheduler like this:
try:
    sched.start()
except KeyboardInterrupt:
    # Ctrl-C raises KeyboardInterrupt (SIGINT), not SIGTERM
    logging.debug('Got SIGINT! Terminating...')
Your corrected code should look like this:
from apscheduler.scheduler import Scheduler
import logging
import time
logging.basicConfig()
sched = Scheduler(standalone=True)
(...)
code to make excel sheet and json file
(...)
#sched.interval_schedule(seconds = 15)
def job():
    excelapi_final()
try:
    sched.start()
except KeyboardInterrupt:
    # Ctrl-C raises KeyboardInterrupt (SIGINT), not SIGTERM
    logging.debug('Got SIGINT! Terminating...')
This way the program will stop when Ctrl-C is pressed.
You can gracefully shut it down:
import signal
from apscheduler.scheduler import Scheduler
import logging
import time
logging.basicConfig()
sched = Scheduler()
(...)
#create excel sheet and json file
(...)
#sched.interval_schedule(seconds = 15)
def job():
    excelapi_final()
sched.start()
def gracefully_exit(signum, frame):
    print('Stopping...')
    sched.shutdown()
signal.signal(signal.SIGINT, gracefully_exit)
signal.signal(signal.SIGTERM, gracefully_exit)
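The snippet registers the handlers but does not show what keeps the main thread waiting afterwards. A minimal standard-library sketch of one way to combine the two follows: the handler calls a shutdown callback and sets an event, and the main thread waits on that event. The names `make_shutdown_handler` and `run_until_signalled` are hypothetical, and `shutdown` stands in for something like `sched.shutdown`; no APScheduler call is made here.

```python
import signal
import threading


def make_shutdown_handler(stop_event, shutdown):
    """Build a signal handler that runs shutdown() and releases the waiter."""
    def handler(signum, frame):
        print("Stopping...")
        shutdown()
        stop_event.set()
    return handler


def run_until_signalled(shutdown):
    """Block the main thread until SIGINT or SIGTERM arrives."""
    stop_event = threading.Event()
    handler = make_shutdown_handler(stop_event, shutdown)
    # signal.signal() must be called from the main thread.
    signal.signal(signal.SIGINT, handler)
    signal.signal(signal.SIGTERM, handler)
    # Waiting with a timeout keeps the main thread responsive to signals.
    while not stop_event.wait(timeout=0.5):
        pass
```

A script would call `run_until_signalled(sched.shutdown)` as its last statement, after the scheduler has been started.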
