I have Python scheduler code that prints "Hello" and "World!":
import sched
import time

def x():
    print("Hello")

s = sched.scheduler(time.time, time.sleep)
s.enter(10, 1, x, ())
s.run()
print("World!")
This waits for 10 seconds and outputs:
Hello
World!
I think a scheduler's job is to schedule a task without interrupting the current process. But here it puts the whole program to sleep and behaves just like the code below:
import time

def x():
    print("Hello")

time.sleep(10)
x()
print("World!")
I guess the scheduler puts the program to sleep because of the time.sleep argument in sched.scheduler(time.time, time.sleep).
Is there any way to make it work like a real scheduler, without blocking the main process and without using any multithreading or multiprocessing?
From the docs:
In multi-threaded environments, the scheduler class has limitations with respect to thread-safety, inability to insert a new task before the one currently pending in a running scheduler, and holding up the main thread until the event queue is empty. Instead, the preferred approach is to use the threading.Timer class instead.
from threading import Timer

def x():
    print("Hello")

Timer(10, x, ()).start()
print("World!")
without blocking the main process and without using any multithreading or multiprocessing
A single thread can't do two things at the same time, so... threading is the absolute minimum you need to use in order not to block.
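For illustration, a minimal sketch of that minimum (my own, not from the docs): keep the sched code from the question, but drive s.run() from a daemon thread so the main thread isn't blocked:

import sched
import threading
import time

s = sched.scheduler(time.time, time.sleep)
s.enter(10, 1, print, ("Hello",))
threading.Thread(target=s.run, daemon=True).start()
print("World!")  # printed immediately; "Hello" follows 10 seconds later
time.sleep(11)   # keep the process alive long enough for the event to fire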
The scheduler's job is not only to execute subsequent tasks after a particular time period; it also holds a thread for the scheduled task until that task completes.
from threading import Timer

def xy():
    print("End")

Timer(4, xy, ()).start()
print("Start")
Initially, Start is printed; 4 seconds later, End is printed.
What if we have an endless loop?
from threading import Timer

def xy():
    print("End")

Timer(4, xy, ()).start()
print("Start")

i = 0
while i < 10:
    print(i)  # i is never incremented, so this loop never ends
The loop keeps printing continuously, and End still appears after 4 seconds, interleaved with the loop output. This shows that the Timer holds its own background thread until the interval elapses, while the main thread keeps running.
Related
I use this method to launch a few dozen (fewer than a thousand) calls of do_it at different times in the future:
import threading

timers = []
while True:
    for i in range(20):
        t = threading.Timer(i * 0.010, do_it, [i])  # I pass the parameter i to function do_it
        t.start()
        timers.append(t)  # so that they can be cancelled if needed
    wait_for_something_else()  # this can last from 5 ms to 20 seconds
The runtime of each do_it call is very fast (much less than 0.1 ms) and non-blocking. I would like to avoid spawning hundreds of new threads for such a simple task.
How could I do this with only one additional thread for all do_it calls?
Is there a simple way to do this with Python, without third party library and only standard library?
As I understand it, you want a single worker thread that can process submitted tasks, not in the order they are submitted, but rather in some prioritized order. This seems like a job for the thread-safe queue.PriorityQueue.
from dataclasses import dataclass, field
from threading import Thread
from typing import Any
from queue import PriorityQueue

@dataclass(order=True)
class PrioritizedItem:
    priority: float
    item: Any = field(compare=False)

def thread_worker(q: PriorityQueue[PrioritizedItem]):
    while True:
        do_it(q.get().item)
        q.task_done()

q = PriorityQueue()
t = Thread(target=thread_worker, args=(q,))
t.start()

while True:
    for i in range(20):
        q.put(PrioritizedItem(priority=i * 0.010, item=i))
    wait_for_something_else()
This code assumes you want to run forever. If not, you can add a timeout to the q.get call in thread_worker and return when the queue.Empty exception is raised because the timeout expired. That way you'll be able to join the queue/thread after all the jobs have been processed and the timeout has expired.
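For example, a minimal sketch of that timeout variant (the 30-second idle timeout is an assumption; tune it to your workload):

from queue import Empty

def thread_worker(q: PriorityQueue):
    while True:
        try:
            task = q.get(timeout=30)  # give up after 30 idle seconds
        except Empty:
            return                    # exit the worker so the thread can be joined
        do_it(task.item)
        q.task_done()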
If you want to wait until some specific time in the future to run the tasks, it gets a bit more complicated. Here's a version that extends the above approach by sleeping in the worker thread until the specified time has arrived, but be aware that time.sleep is only as accurate as your OS allows it to be.
from dataclasses import astuple, dataclass, field
from datetime import datetime, timedelta
from time import sleep
from threading import Thread
from typing import Any
from queue import PriorityQueue

@dataclass(order=True)
class TimedItem:
    when: datetime
    item: Any = field(compare=False)

def thread_worker(q: PriorityQueue[TimedItem]):
    while True:
        when, item = astuple(q.get())
        sleep_time = (when - datetime.now()).total_seconds()
        if sleep_time > 0:
            sleep(sleep_time)
        do_it(item)
        q.task_done()

q = PriorityQueue()
t = Thread(target=thread_worker, args=(q,))
t.start()

while True:
    now = datetime.now()
    for i in range(20):
        q.put(TimedItem(when=now + timedelta(seconds=i * 0.010), item=i))
    wait_for_something_else()
To address this problem using only a single extra thread, we have to sleep in that thread, so it's possible that a new task with a higher priority comes in while the worker is sleeping. In that case the worker would process that new high-priority task only after it's done with the current one. The above code assumes that scenario will not happen, which seems reasonable based on the problem description. If it might happen, you can alter the sleep code to repeatedly poll whether the task at the front of the priority queue has come due; the disadvantage of such a polling approach is that it is more CPU intensive. A sketch of an alternative follows.
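Here's a sketch of my own variant (not part of the original answer) that avoids busy polling: instead of a plain sleep, the worker waits on q.get with a timeout equal to the remaining time, so a newly submitted, earlier task can wake it up. It reuses the TimedItem class and imports from above:

from queue import Empty

def thread_worker(q: PriorityQueue[TimedItem]):
    pending = None  # the task we are currently waiting on
    while True:
        if pending is None:
            pending = q.get()
        remaining = (pending.when - datetime.now()).total_seconds()
        if remaining > 0:
            try:
                # Wait out the deadline, but wake early if a new task
                # arrives; keep whichever of the two is due sooner.
                newer = q.get(timeout=remaining)
                if newer < pending:
                    q.put(pending)
                    pending = newer
                else:
                    q.put(newer)
                q.task_done()  # balance the extra put so q.join() still works
                continue
            except Empty:
                pass  # the pending task has come due
        do_it(pending.item)
        q.task_done()
        pending = None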
Also, if you can guarantee that the relative order of the tasks won't change after they've been submitted to the worker, then you can replace the priority queue with a regular queue.Queue to simplify the code somewhat.
These do_it tasks can be cancelled by removing them from the queue.
The above code was tested with the following mock definitions:
def do_it(x):
    print(x)

def wait_for_something_else():
    sleep(5)
An alternative approach that uses no extra threads would be to use asyncio, as pointed out by smcjones. Here's an approach using asyncio that calls do_it at specific times in the future by using loop.call_later:
import asyncio

def do_it(x):
    print(x)

async def wait_for_something_else():
    await asyncio.sleep(5)

async def main():
    loop = asyncio.get_running_loop()
    while True:
        for i in range(20):
            loop.call_later(i * 0.010, do_it, i)
        await wait_for_something_else()

asyncio.run(main())
These do_it tasks can be cancelled using the handle returned by loop.call_later.
This approach will, however, require either switching over your program to use asyncio throughout, or running the asyncio event loop in a separate thread.
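As a sketch of the second option (assuming the do_it defined above), the event loop can run in a daemon thread while the main thread schedules calls onto it; call_soon_threadsafe is the only loop method that is safe to call from another thread:

import asyncio
import threading

loop = asyncio.new_event_loop()
threading.Thread(target=loop.run_forever, daemon=True).start()

# Hand call_later over to the loop's own thread via call_soon_threadsafe.
for i in range(20):
    loop.call_soon_threadsafe(loop.call_later, i * 0.010, do_it, i)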
It sounds like you want something to be non-blocking and asynchronous, but also single-processed and single-threaded (one thread dedicated to do_it).
If this is the case, and especially if any networking is involved, so long as you're not actively doing serious I/O on your main thread, it is probably worthwhile using asyncio instead.
It's designed to handle non-blocking operations, and allows you to make all of your requests without waiting for a response.
Example:
import asyncio

# do_it must be a coroutine for create_task; a stub for illustration:
async def do_it(i):
    print(i)

async def wait_for_something_else():
    await asyncio.sleep(5)

async def main():
    while True:
        tasks = []
        for i in range(20):
            tasks.append(asyncio.create_task(do_it(i)))
        await wait_for_something_else()
        for task in tasks:
            await task

asyncio.run(main())
Given the time spent on blocking I/O (seconds), you'll probably lose more time managing threads than you'll save by spawning a separate thread for these other operations.
As you have said that each series of 20 do_it calls starts when wait_for_something_else has finished, I would recommend calling the join method in each iteration of the while loop:
import threading

timers = []
while True:
    for i in range(20):
        t = threading.Timer(i * 0.010, do_it, [i])  # I pass the parameter i to function do_it
        t.start()
        timers.append(t)  # so that they can be cancelled if needed
    wait_for_something_else()  # this can last from 5 ms to 20 seconds
    for t in timers[-20:]:
        t.join()
So the requirements are:
- the do_it calls run in order and are cancellable
- all do_it calls run in one thread, sleeping for the specified timing (though sleep may not be exact)
- a should_run_it flag decides whether each do_it should still run (that's the cancellation)

Is it something like this?
import threading
import time

def do_it(i):
    print(f"[{i}] {time.time()}")

should_run_it = {i: True for i in range(20)}

def guard_do_it(i):
    if should_run_it[i]:
        do_it(i)

def run_do_it():
    for i in range(20):
        guard_do_it(i)
        time.sleep(0.010)

if __name__ == "__main__":
    t = threading.Timer(0.010, run_do_it)
    start = time.time()
    print(start)
    t.start()
    # should_run_it[5] = should_run_it[10] = should_run_it[15] = False  # test cancellation
    t.join()
    end = time.time()
    print(end)
    print(end - start)
I don't have a ton of experience with threading in Python, so please go easy on me. The concurrent.futures library is part of Python 3 and it's dead simple. I'm providing an example so you can see how straightforward it is.
Concurrent.futures with exactly one thread for do_it() and concurrency:
import concurrent.futures
import time

def do_it(iteration):
    time.sleep(0.1)
    print('do it counter', iteration)

def wait_for_something_else():
    time.sleep(1)
    print('waiting for something else')

def single_thread():
    # max_workers=1 constrains all do_it calls to exactly one thread
    with concurrent.futures.ThreadPoolExecutor(max_workers=1) as executor:
        futures = (executor.submit(do_it, i) for i in range(20))
        for future in concurrent.futures.as_completed(futures):
            future.result()

def do_asap():
    wait_for_something_else()

# run single_thread and do_asap concurrently in a second thread pool
with concurrent.futures.ThreadPoolExecutor() as executor:
    futures = [executor.submit(single_thread), executor.submit(do_asap)]
    for future in concurrent.futures.as_completed(futures):
        future.result()
The code above uses max_workers=1 to execute do_it() in a single thread: inside single_thread(), the ThreadPoolExecutor is created with max_workers=1, which constrains the work to exactly one thread. At the bottom, both methods are submitted to a second thread pool executor, so single_thread and do_asap run concurrently while all the do_it calls stay on one non-blocking thread.
The concurrent.futures docs describe how to control the number of threads. When max_workers is not specified, the default is max_workers = min(32, os.cpu_count() + 4).
I am writing a queue-processing application that uses threads to wait for and respond to queue messages delivered to the app. The main part of the application just needs to stay alive. For a code example like:
while True:
    pass

or

while True:
    time.sleep(1)
Which one will have the least impact on a system? What is the preferred way to do nothing, but keep a python app running?
I would imagine time.sleep() will have less overhead on the system. Using pass will cause the loop to immediately re-evaluate and peg the CPU, whereas using time.sleep will allow the execution to be temporarily suspended.
EDIT: just to prove the point, if you launch the python interpreter and run this:
>>> while True:
...     pass
...
You can watch Python start eating up 90-100% CPU instantly, versus:
>>> import time
>>> while True:
...     time.sleep(1)
...
Which barely even registers on the Activity Monitor (using OS X here but it should be the same for every platform).
Why sleep? You don't want to sleep, you want to wait for the threads to finish.
So
# store the threads you start in a your_threads list, then
for a_thread in your_threads:
    a_thread.join()
See: threading.Thread.join
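A minimal, self-contained sketch of that pattern (the worker function is made up for illustration):

import threading
import time

def worker(n):
    time.sleep(n)
    print(f"worker {n} done")

your_threads = [threading.Thread(target=worker, args=(i,)) for i in range(3)]
for a_thread in your_threads:
    a_thread.start()
for a_thread in your_threads:
    a_thread.join()  # blocks until each thread finishes, no busy-waiting
print("all threads finished")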
If you are looking for a short, zero-cpu way to loop forever until a KeyboardInterrupt, you can use:
from threading import Event
Event().wait()
Note: Due to a bug, this only works on Python 3.2+. In addition, it appears to not work on Windows. For this reason, while True: sleep(1) might be the better option.
For some background, Event objects are normally used for waiting for long running background tasks to complete:
from threading import Event, Thread
from time import sleep

event = Event()

def do_task():
    sleep(10)
    print('Task complete.')
    event.set()

Thread(target=do_task).start()
event.wait()
print('Continuing...')
Which prints:
Task complete.
Continuing...
signal.pause() is another solution, see https://docs.python.org/3/library/signal.html#signal.pause
Cause the process to sleep until a signal is received; the appropriate handler will then be called. Returns nothing. Not on Windows. (See the Unix man page signal(2).)
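A minimal sketch (the handler and signal choice are just for illustration):

import signal

def handler(signum, frame):
    print(f"received signal {signum}")

signal.signal(signal.SIGINT, handler)  # replace the default Ctrl-C behaviour
signal.pause()  # Unix only; sleeps until any signal is delivered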
I've always seen/heard that using sleep is the better way to do it. Using sleep will keep your Python interpreter's CPU usage from going wild.
You don't give much context to what you are really doing, but maybe Queue could be used instead of an explicit busy-wait loop? If not, I would assume sleep would be preferable, as I believe it will consume less CPU (as others have already noted).
Maybe this is obvious, but anyway: in a case where you are reading from blocking sockets, you could have one thread read from the socket and post suitably formatted messages into a Queue, and then have the rest of your worker threads read from that queue. The workers will then block on reading from the queue, without needing either pass or sleep.
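A minimal sketch of that reader/worker pattern (sock is an assumed connected socket, and handle is a hypothetical processing function):

import queue
import threading

msg_queue = queue.Queue()

def reader(sock):
    while True:
        data = sock.recv(4096)  # blocks until data arrives
        if not data:
            break
        msg_queue.put(data)     # hand the message off to the workers

def worker():
    while True:
        msg = msg_queue.get()   # blocks until a message is queued
        handle(msg)             # hypothetical processing function
        msg_queue.task_done()

threading.Thread(target=reader, args=(sock,), daemon=True).start()
for _ in range(4):
    threading.Thread(target=worker, daemon=True).start()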
Running a method as a background thread with sleep in Python:
import threading
import time

class ThreadingExample(object):
    """Threading example class

    The run() method will be started and it will run in the background
    until the application exits.
    """

    def __init__(self, interval=1):
        """Constructor

        :type interval: int
        :param interval: Check interval, in seconds
        """
        self.interval = interval

        thread = threading.Thread(target=self.run, args=())
        thread.daemon = True  # daemonize thread
        thread.start()        # start the execution

    def run(self):
        """Method that runs forever"""
        while True:
            # Do something
            print('Doing something important in the background')
            time.sleep(self.interval)

example = ThreadingExample()
time.sleep(3)
print('Checkpoint')
time.sleep(2)
print('Bye')
I am trying to use Python threading and am having problems getting the threads to work independently. They seem to run in sequential order, waiting for one to finish before the next thread starts. I have read other posts suggesting that I need to get more work into the threads to differentiate actual CPU work from the overhead of starting and managing threads, and that a sleep timer could be used to simulate this. So I tried that and then measured the task durations.
So my code is below. It first runs three tasks sequentially with a 2-second timer; this takes about 6 seconds to run, as expected. The next section starts three threads, and they should run roughly in parallel if my understanding of threading is correct. I have played with the timers to test the overall duration of this section of code, expecting that if one timer is larger than the other two, the code will execute in an interval closest to that larger one. But what I am seeing is that it takes the same amount of time as the three running in sequence, one after the other.
I got onto this because I am writing some code to read an asynchronous queue in the background. After launching the thread to read the queue, my code seems to stop and wait until the queue reader is stopped, which it normally doesn't as it waits for messages to come in. So what happens is that it never executes the next section of code and it seems to be waiting for the thread to complete.
Also I checked the number of threads active and it remains at the same number, and when I check for the thread ID in the code (not shown) I get the same thread number coming back for every thread.
I am new to Python and am using the Jupyter environment. Is there some option or limitation that I am not aware of that is preventing the threading? Am I just not getting the concept? I don't believe this is related to CPU cores, as the threads are logical threads within the Python runtime. I also ran a similar program in a command shell environment and got the same sequential performance.
Cut and paste this code to see what it does. What am I missing?
import threading
import logging
import datetime
import time
import random

class Parallel:
    def work(self, interval):
        time.sleep(interval)
        name = self.__repr__()
        print(name, " is complete after ", interval, " seconds")

# SetupLogger()
logging.getLogger().setLevel(logging.DEBUG)
logging.debug("thread program start time is %s", datetime.datetime.now())

thread1 = Parallel()
thread2 = Parallel()
thread3 = Parallel()

print("sequential threads::")
thread1.work(2.0)
thread2.work(2.0)
thread3.work(2.0)

logging.info("parallel threads start time is %s ", datetime.datetime.now())
start = time.time()

work1 = threading.Thread(target=thread1.work(1), daemon=True)
work1.start()
print("thread 1 is started and there are ", threading.activeCount(), " threads active")

work2 = threading.Thread(target=thread2.work(2), daemon=False)
work2.start()
print("thread 2 is started and there are ", threading.activeCount(), " threads active")

work3 = threading.Thread(target=thread3.work(5), daemon=False)
work3.start()
print("thread 3 is started and there are ", threading.activeCount(), " threads active")

# wait for all to complete
print("now wait for all to finish at ", datetime.datetime.now())
work1.join()
work2.join()
work3.join()

end = time.time()
logging.info("parallel threads end time is %s with %s elapsed", datetime.datetime.now(), str(end - start))
print("all threads completed at:", datetime.datetime.now())
In the line that initializes the thread, you are actually executing the function instead of passing its reference to the thread.
thread1.work() ----> this will actually execute the function when the program runs and encounters this statement
So when your program reaches this line,
work1 = threading.Thread(target=thread1.work(1), daemon=True)
and encounters target=thread1.work(1), it simply calls the function right there and the actual thread does nothing.
thread1.work is a reference to the function, which you need to pass to your Thread object.
So just remove the parentheses, pass the argument separately, and your code becomes
work1 = threading.Thread(target=thread1.work, daemon=True, args=(1,))
and this will behave as you expect.
I've been trying to make a precise timer in Python, or as precise as the OS allows. But it seems to be more complicated than I initially thought.
This is how I would like it to work:
from time import sleep
from threading import Timer

def do_this():
    print("hello, world")

t = Timer(4, do_this)
t.start()
sleep(20)
t.cancel()
During those 20 seconds I would like do_this to execute every fourth second. However, do_this executes once, and then the script terminates after 20 seconds.
Another way would be to create a thread with a while loop.
import time
import threading
import datetime

shutdown_event = threading.Event()

def dowork():
    while not shutdown_event.is_set():
        print(datetime.datetime.now())
        time.sleep(1.0)

def main():
    t = threading.Thread(target=dowork, args=(), name='worker')
    t.start()
    print("Instance started")
    try:
        while t.is_alive():
            t.join(timeout=1.0)
    except (KeyboardInterrupt, SystemExit):
        shutdown_event.set()

if __name__ == '__main__':
    main()
This thread executes as expected, but I get a timing drift. In this case I have to compensate for the time it takes to execute the code in the while loop by adjusting the sleep accordingly.
Is there a simple way in python to execute a timer every second (or any interval) without introducing a drift compared to the system time without having to compensate the sleep(n) parameter?
Thanks for helping,
/Anders
If dowork() always runs in less time than your intervals, you can spawn a new thread every 4 seconds in a loop:
import random
import threading
import time
from time import sleep

def dowork():
    wlen = random.random()
    sleep(wlen)  # emulate doing some work
    print('work done in %0.2f seconds' % wlen)

def main():
    while 1:
        t = threading.Thread(target=dowork)
        t.start()  # the thread must be started, or dowork never runs
        time.sleep(4)
If dowork() could potentially run for more than 4 seconds, then in your main loop you want to make sure the previous job is finished before spawning a new one.
However, time.sleep() can itself drift, because no guarantees are made about how long the thread will actually be suspended. The correct way of doing it is to figure out how long the job took and sleep only for the remainder of the interval. I think this is how UI and game rendering engines work, where they have to display a fixed number of frames per second at fixed times, and rendering each frame can take a different length of time.
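For illustration, a minimal sketch of that idea (my own, with made-up names): compute each deadline from the original start time instead of sleeping a fixed interval, so work time and sleep jitter don't accumulate:

import time

def run_every(interval, func, iterations):
    start = time.monotonic()
    for n in range(1, iterations + 1):
        func()
        # Sleep until the next absolute deadline rather than a fixed
        # interval, so drift doesn't accumulate across iterations.
        remaining = start + n * interval - time.monotonic()
        if remaining > 0:
            time.sleep(remaining)

run_every(1.0, lambda: print(time.time()), 5)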
I have a simple app that listens to a socket connection. Whenever certain chunks of data come in, a callback handler is called with that data. In that callback I want to send my data to another process or thread, as it could take a long time to deal with. I was originally running the code in the callback function, but it blocks!
What's the proper way to spin off a new task?
threading is the library usually used for resource-based multithreading. The multiprocessing library is another option, but it is designed more for running intensive parallel computing tasks; threading is generally the recommended library in your case.
Example
import threading, time

def my_threaded_func(arg, arg2):
    print("Running thread! Args:", (arg, arg2))
    time.sleep(10)
    print("Done!")

thread = threading.Thread(target=my_threaded_func, args=("I'ma", "thread"))
thread.start()
print("Spun off thread")
The multiprocessing module has worker pools. If you don't need a pool of workers, you can use Process to run something in parallel with your main program.
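For example, a minimal sketch of the Process approach (handle_data stands in for the callback's heavy work):

from multiprocessing import Process

def handle_data(data):
    print("processing", data)  # long-running work would go here

if __name__ == "__main__":
    p = Process(target=handle_data, args=("chunk",))
    p.start()  # runs in a separate process; the caller is not blocked
    # ... the main program keeps going here ...
    p.join()   # wait for the worker when you're ready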
import threading
from time import sleep
import sys

# assume function defs ...

class myThread(threading.Thread):
    def __init__(self, threadID):
        threading.Thread.__init__(self)
        self.threadID = threadID

    def run(self):
        if self.threadID == "run_exe":
            run_exe()

def main():
    itemList = getItems()
    for item in itemList:
        thread = myThread("run_exe")
        thread.start()
        sleep(.1)
        listenToSocket(item)
        while thread.is_alive():
            pass  # a way to wait for the thread to finish before looping

main()
sys.exit(0)
The sleep between thread.start() and listenToSocket(item) ensures that the thread is established before you begin to listen. I implemented this code in a unit test framework where I had to launch multiple non-blocking processes (len(itemList) times) because my other testing framework (listenToSocket(item)) was dependent on the processes.
run_exe() can trigger a subprocess call that can be blocking (i.e. invoking pipe.communicate()) so that output data from the execution is still printed in time with the Python script's output. But the nature of threading makes this OK.
So this code solves two problems: printing the output of a subprocess without blocking script execution, and dynamically creating and starting multiple threads sequentially (which makes maintenance of the script easier if I ever add more items to my itemList later).
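For illustration, a hedged sketch of what such a run_exe might look like (the command line is hypothetical; only the pipe.communicate() call comes from the description above):

import subprocess

def run_exe():
    pipe = subprocess.Popen(
        ["some_tool", "--flag"],  # hypothetical command line
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
    )
    out, _ = pipe.communicate()   # blocks this thread only, not the script
    print(out.decode(), end="")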