In my Python script I am triggering a long-running process (drive()) that is encapsulated in a class method:
car.py
import time

class Car(object):
    def __init__(self, sleep_time_in_seconds, miles_to_drive):
        self.sleep_time_in_seconds = sleep_time_in_seconds
        self.miles_to_drive = miles_to_drive

    def drive(self):
        for mile in range(self.miles_to_drive):
            print('driving mile #{}'.format(mile))
            time.sleep(self.sleep_time_in_seconds)
app.py
from car import Car

sleep_time = 2
total_miles = 5

car = Car(sleep_time_in_seconds=sleep_time, miles_to_drive=total_miles)
car.drive()

def print_driven_distance_in_percent(driven_miles):
    print("Driven distance: {}%".format(100 * driven_miles / total_miles))
In the main script app.py I'd like to know the progress of the drive() process. One way of solving this would be a loop that polls the current progress from the Car class. As far as I have googled, polling seems to be an expected pattern if the Car class inherits from Thread - for example:
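A minimal polling sketch - it assumes drive() runs in a worker thread and that Car is extended with a driven_miles attribute that drive() updates (neither is part of the code above):

from threading import Thread
import time

car = Car(sleep_time_in_seconds=sleep_time, miles_to_drive=total_miles)
drive_thread = Thread(target=car.drive)
drive_thread.start()

while drive_thread.is_alive():
    # poll the hypothetical progress attribute once per second
    print_driven_distance_in_percent(car.driven_miles)
    time.sleep(1)
drive_thread.join()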
I'm just curious whether it's possible to somehow notify the main script from within the Car class about the current progress.
I thought about maybe creating a wrapper class that I can pass as an argument to the Car class; the car instance could then call the wrapper class' print_progress function.
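A rough sketch of that callback idea - the progress_callback parameter is an assumption, not part of the original Car class:

import time

class Car(object):
    def __init__(self, sleep_time_in_seconds, miles_to_drive, progress_callback=None):
        self.sleep_time_in_seconds = sleep_time_in_seconds
        self.miles_to_drive = miles_to_drive
        self.progress_callback = progress_callback  # hypothetical notification hook

    def drive(self):
        for mile in range(self.miles_to_drive):
            print('driving mile #{}'.format(mile))
            time.sleep(self.sleep_time_in_seconds)
            if self.progress_callback is not None:
                self.progress_callback(mile + 1)  # notify the caller after each mile

# app.py could then pass its print function directly:
car = Car(sleep_time, total_miles, progress_callback=print_driven_distance_in_percent)
car.drive()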
Or is there a more Pythonic way to notify the caller script on demand?
Thanks
EDIT:
Based on Artiom Kozyrev's answer - this is what I wanted to achieve:
import time
from threading import Thread
from queue import Queue

def ask_queue(q):
    """
    The function to control status of our status display thread
    q - Queue - need to show status of task
    """
    while True:
        x = q.get()  # take element from Queue
        if x == "STOP":
            break
        print("Process completed in {} percents".format(x))
    print("100% finished")

class MyClass:
    """My example class"""
    def __init__(self, name, status_queue):
        self.name = name
        self.status_queue = status_queue

    def my_run(self):
        """
        The function we would like to monitor
        """
        # th = Thread(target=MyClass.ask_queue, args=(self.status_queue,))  # monitoring thread
        # th.start()  # start monitoring thread
        for i in range(100):  # start doing our main function we would like to monitor
            print("{} {}".format(self.name, i))
            if i % 5 == 0:  # every 5 steps show status of progress
                self.status_queue.put(i)  # send status to Queue
            time.sleep(0.1)
        self.status_queue.put("STOP")  # stop Queue
        # th.join()

if __name__ == "__main__":
    q = Queue()
    th = Thread(target=ask_queue, args=(q,))  # monitoring thread
    th.start()  # start monitoring thread
    # tests
    x = MyClass("Maria", q)
    x.my_run()
    th.join()
Thanks to all!!
Thanks for the interesting question. Typically you do not need a separate status thread for this case - you could just print the status inside the method you would like to monitor - but for training purposes you can solve the issue the following way. Please follow the comments and feel free to ask:
import time
from threading import Thread
from queue import Queue

class MyClass:
    """My example class"""
    def __init__(self, name, status_queue):
        self.name = name
        self.status_queue = status_queue

    @staticmethod
    def ask_queue(q):
        """
        The function to control status of our status display thread
        q - Queue - need to show status of task
        """
        while True:
            x = q.get()  # take element from Queue
            if x == "STOP":
                break
            print("Process completed in {} percents".format(x))
        print("100% finished")

    def my_run(self):
        """
        The function we would like to monitor
        """
        th = Thread(target=MyClass.ask_queue, args=(self.status_queue,))  # monitoring thread
        th.start()  # start monitoring thread
        for i in range(100):  # start doing our main function we would like to monitor
            print("{} {}".format(self.name, i))
            if i % 5 == 0:  # every 5 steps show status of progress
                self.status_queue.put(i)  # send status to Queue
            time.sleep(0.1)
        self.status_queue.put("STOP")  # stop Queue
        th.join()

if __name__ == "__main__":
    # tests
    x = MyClass("Maria", Queue())
    x.my_run()
    print("*" * 200)
    x.my_run()
I have a class (MyClass) which contains a queue (self.msg_queue) of actions that need to be run and I have multiple sources of input that can add tasks to the queue.
Right now I have three functions that I want to run concurrently:
MyClass.get_input_from_user()
Creates a window in tkinter that has the user fill out information and when the user presses submit it pushes that message onto the queue.
MyClass.get_input_from_server()
Checks the server for a message, reads the message, and then puts it onto the queue. This method uses functions from MyClass's parent class.
MyClass.execute_next_item_on_the_queue()
Pops a message off of the queue and then acts upon it. It is dependent on what the message is, but each message corresponds to some method in MyClass or its parent which gets run according to a big decision tree.
Process description:
After the class has joined the network, I have it spawn three threads (one for each of the above functions). Each threaded function adds items to the queue with the syntax "self.msg_queue.put(message)" and removes items from the queue with "self.msg_queue.get_nowait()".
Problem description:
The issue I am having is that each thread seems to be modifying its own queue object; they are not sharing the class's msg_queue even though the functions are all members of the same class.
I am not familiar enough with multiprocessing to know which error messages are the important ones; however, it states that it cannot pickle a weakref object (with no indication of which object is the weakref), and that within the queue.put() call the line "self._sem.acquire(block, timeout)" yields a "[WinError 5] Access is denied" error. Would it be safe to assume that this failure is the result of the queue's reference not copying over properly?
[I am using Python 3.7.2 and the multiprocessing package's Process and Queue]
[I have seen multiple Q/As about having threads shuttle information between classes--create a master harness that generates a queue and then pass that queue as an argument to each thread. If the functions didn't have to use other functions from MyClass I could see adapting this strategy by having those functions take in a queue and use a local variable rather than class variables.]
[I am fairly confident that this error is not the result of passing my queue to the tkinter object as my unit tests on how my GUI modifies its caller's queue work fine]
Below is a minimal reproducible example for the queue's error:
from multiprocessing import Queue
from multiprocessing import Process
import queue
import time

class MyTest:
    def __init__(self):
        self.my_q = Queue()
        self.counter = 0

    def input_function_A(self):
        while True:
            self.my_q.put(self.counter)
            self.counter = self.counter + 1
            time.sleep(0.2)

    def input_function_B(self):
        while True:
            self.counter = 0
            self.my_q.put(self.counter)
            time.sleep(1)

    def output_function(self):
        while True:
            try:
                var = self.my_q.get_nowait()
            except queue.Empty:
                var = -1
            except:
                break
            print(var)
            time.sleep(1)

    def run(self):
        process_A = Process(target=self.input_function_A)
        process_B = Process(target=self.input_function_B)
        process_C = Process(target=self.output_function)
        process_A.start()
        process_B.start()
        process_C.start()
        # without this it generates the WinError:
        # with this it still behaves as if the two input functions do not modify the queue
        process_C.join()

if __name__ == '__main__':
    test = MyTest()
    test.run()
Indeed - these are not "threads" - these are "processes". If you were using multithreading rather than multiprocessing, the self.my_q instance would be the same object, placed at the same memory space on the computer.
multiprocessing instead creates a new process (by forking on Unix, or spawning a fresh interpreter on Windows), and any data in the original process (the one executing the "run" call) is duplicated - so each subprocess sees its own "Queue" instance, unrelated to the others.
The correct way to have various processes share a multiprocessing.Queue object is to pass it as a parameter to the target methods. The simplest way to reorganize your code so that it works is:
from multiprocessing import Queue
from multiprocessing import Process
import queue
import time

class MyTest:
    def __init__(self):
        self.my_q = Queue()
        self.counter = 0

    # the parameter is named "q" so it does not shadow the "queue" module
    def input_function_A(self, q):
        while True:
            q.put(self.counter)
            self.counter = self.counter + 1
            time.sleep(0.2)

    def input_function_B(self, q):
        while True:
            self.counter = 0
            q.put(self.counter)
            time.sleep(1)

    def output_function(self, q):
        while True:
            try:
                var = q.get_nowait()
            except queue.Empty:
                var = -1
            except:
                break
            print(var)
            time.sleep(1)

    def run(self):
        # pass the one shared Queue instance explicitly to each worker
        process_A = Process(target=self.input_function_A, args=(self.my_q,))
        process_B = Process(target=self.input_function_B, args=(self.my_q,))
        process_C = Process(target=self.output_function, args=(self.my_q,))
        process_A.start()
        process_B.start()
        process_C.start()
        # keep the parent alive until the consumer process exits
        process_C.join()

if __name__ == '__main__':
    test = MyTest()
    test.run()
As you can see, since your class is not actually sharing any data through the instance's attributes, this "class" design does not make much sense for your application - except for grouping the different workers in the same code block.
It would be possible to have a magic multiprocess class with an internal method that actually starts the worker methods and shares the Queue instance - so if you have a lot of those in a project, there would be a lot less boilerplate.
Something along these lines:
from multiprocessing import Queue
from multiprocessing import Process
import queue
import time

class MPWorkerBase:
    def __init__(self, *args, **kw):
        self.queue = None
        self.is_parent_process = False
        self.is_child_process = False
        self.processes = []
        # ensure this can be used as a collaborative mixin
        super().__init__(*args, **kw)

    def run(self):
        if self.is_parent_process or self.is_child_process:
            # workers already initialized
            return
        self.queue = Queue()
        processes = []
        cls = self.__class__
        for name in dir(cls):
            method = getattr(cls, name)
            if callable(method) and getattr(method, "_MP_worker", False):
                process = Process(target=self._start_worker, args=(self.queue, name))
                processes.append(process)
                process.start()
        # Setting these attributes only after spawning ensures the child
        # processes still have the initial values for them.
        self.is_parent_process = True
        self.processes = processes

    def _start_worker(self, queue, method_name):
        # this method is called in a new spawned process - attribute
        # changes here no longer reflect attributes on the
        # object in the initial process
        # overwrite queue in this process with the queue object sent over the wire:
        self.queue = queue
        self.is_child_process = True
        # call the worker method
        getattr(self, method_name)()

    def __del__(self):
        for process in self.processes:
            process.join()

def worker(func):
    """decorator to mark a method as a worker that should
    run in its own subprocess
    """
    func._MP_worker = True
    return func

class MyTest(MPWorkerBase):
    def __init__(self):
        super().__init__()
        self.counter = 0

    @worker
    def input_function_A(self):
        while True:
            self.queue.put(self.counter)
            self.counter = self.counter + 1
            time.sleep(0.2)

    @worker
    def input_function_B(self):
        while True:
            self.counter = 0
            self.queue.put(self.counter)
            time.sleep(1)

    @worker
    def output_function(self):
        while True:
            try:
                var = self.queue.get_nowait()
            except queue.Empty:
                var = -1
            except:
                break
            print(var)
            time.sleep(1)

if __name__ == '__main__':
    test = MyTest()
    test.run()
I'm a newbie to Python and learning about threads. I have created a sample Producer-Consumer program wherein I add a movie to a list in the Producer thread and pop the front element from the same list in the Consumer thread. The problem is that while printing the items of the movie list along with the thread name, I get the wrong thread name in the Producer thread. This is my code:
Producer.py
from threading import Thread
from threading import RLock
import time

class Producer(Thread):
    def __init__(self):
        Thread.__init__(self)
        Thread.name = 'Producer'
        self.movieList = list()
        self.movieListLock = RLock()

    def printMovieList(self):
        self.movieListLock.acquire()
        if len(self.movieList) > 0:
            for movie in self.movieList:
                print(Thread.name, movie)
            print('\n')
        self.movieListLock.release()

    def pushMovieToList(self, movie):
        self.movieListLock.acquire()
        self.movieList.append(movie)
        self.printMovieList()
        self.movieListLock.release()

    def run(self):
        for i in range(6):
            self.pushMovieToList('Avengers' + str(i + 1))
            time.sleep(1)
Consumer.py
from threading import Thread
import time

class Consumer(Thread):
    def __init__(self):
        Thread.__init__(self)
        Thread.name = 'Consumer'
        self.objProducer = None

    def popMovieFromList(self):
        self.objProducer.movieListLock.acquire()
        if len(self.objProducer.movieList) > 0:
            movie = self.objProducer.movieList.pop(0)
            print(Thread.name, ':', movie)
            print('\n')
        self.objProducer.movieListLock.release()

    def run(self):
        while True:
            time.sleep(1)
            self.popMovieFromList()
Main.py
from Producer import *
from Consumer import *

def main():
    objProducer = Producer()
    objConsumer = Consumer()
    objConsumer.objProducer = objProducer
    objProducer.start()
    objConsumer.start()
    objProducer.join()
    objConsumer.join()

main()
I am not sure whether you have solved this problem yet.
Hope my answer will be helpful.
You can check the threading documentation.
It says that multiple threads may be given the same name:
name
A string used for identification purposes only. It has no semantics. Multiple threads may be given the same name. The initial name is set by the constructor.
When you assign Thread.name = '...', you set an attribute on the Thread class itself, so the value is shared between the different threads.
If you want to set the name of a thread, set it on the thread object like this:
class Producer(Thread):
    def __init__(self):
        Thread.__init__(self)
        self.name = 'Producer'
and get it via threading.current_thread().name:
if len(self.movieList) > 0:
    for movie in self.movieList:
        print(threading.current_thread().name, movie)
Hope you enjoy it!
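A short demonstration of the class-attribute pitfall described above (a sketch, not part of the original code):

from threading import Thread

Thread.name = 'Producer'  # assigning on the class replaces the name property...
Thread.name = 'Consumer'  # ...for every thread, so the second assignment wins
print(Thread.name)        # prints 'Consumer' - the 'Producer' name is lost

t = Thread()
t.name = 'Producer'       # assigning on the instance keeps each thread's name separate
print(t.name)             # prints 'Producer'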
My aim is to use the simpy (3.0.10) RealtimeEnvironment with strict=True and be able to organise and control processes during a continuous simulation.
By using threads I will be able to do more extensive modelling outside simpy and keep the realtime part small and fast.
But causing delays results in:
"RuntimeError: Simulation too slow for real time"
A little exercise:
- aprocess = a class with a simple yield timeout
- mainprocess with task=aprocess plus print env.now
- Execute mainprocess
- Next, create a thread in mainprocess
- Daemonize the thread to let simpy continue
- Start a thread taking more time than the real-time step
- Return when ready
I can't get this to work, so apparently I'm doing something wrong here.
Looking forward to further insights with your help.
Here is the code:
import simpy
from threading import Thread
from time import sleep

# define a simple timeout process class
class aprocess(object):
    def __init__(self, env, name='aname'):
        self.env = env
        self.name = name

    def execute(self):
        while True:  # repeat each run step
            print self.name, 'status now: %d at yield 1' % self.env.now
            yield self.env.timeout(5)
            print self.name, 'status now: %d at yield 2' % self.env.now
            yield self.env.timeout(2)

# define the main process class
class mainprocess(object):
    def __init__(self, env, name='aname'):
        self.env = env
        self.name = name

    def printstatus(self):
        while True:
            print 'status now: %d' % self.env.now
            yield self.env.timeout(1)

    def newtask(self, newTaskname='newtask'):
        # Task2name = str(raw_input('give a name for task2:'))
        # this will cause the function to delay longer than accepted by simpy for realtime:
        sleep(3)
        self.newTaskname = newTaskname
        newTask = aprocess(self.env, self.newTaskname)
        self.env.process(newTask.execute())
        print 'THREAD: new Task process created'

    def execute(self):
        t = None
        # processes are events too, so they can call other processes:
        self.env.process(self.printstatus())
        Task1 = aprocess(self.env, 'Task1')
        self.env.process(Task1.execute())
        print 'MAINPROCESS: Task 1 process created'
        yield self.env.timeout(6)
        print "creating first thread"
        t = Thread(target=self.newtask())
        t.deamon = True  # Daemonize thread
        t.start()
        while self.env.now < 20:
            if not t.is_alive:  # check if completed
                print 't.is_alive = False: thread completed'
                t.join()  # not sure if this is required, is_alive already checks the thread
            yield self.env.timeout(1)

# run the simulation
env1 = simpy.rt.RealtimeEnvironment(initial_time=0, factor=1.0, strict=True)
Maintask = mainprocess(env1, 'Mainproc')
Mainproc = env1.process(Maintask.execute())
env1.run(until=Mainproc)
Given the following class:
from abc import ABCMeta, abstractmethod
from time import sleep
import threading
from threading import active_count, Thread

class ScraperPool(metaclass=ABCMeta):
    Queue = []
    ResultList = []

    def __init__(self, Queue, MaxNumWorkers=0, ItemsPerWorker=50):
        # Initialize attributes
        self.MaxNumWorkers = MaxNumWorkers
        self.ItemsPerWorker = ItemsPerWorker
        self.Queue = Queue  # For testing purposes.

    def initWorkerPool(self, PrintIDs=True):
        for w in range(self.NumWorkers()):
            Thread(target=self.worker, args=(w + 1, PrintIDs,)).start()
            sleep(1)  # Explicitly wait one second for this worker to start.

    def run(self):
        self.initWorkerPool()
        # Wait until all workers (i.e. threads) are done.
        while active_count() > 1:
            print("Active threads: " + str(active_count()))
            sleep(5)
        self.HandleResults()

    def worker(self, id, printID):
        if printID:
            print("Starting worker " + str(id) + ".")
        while len(self.Queue) > 0:
            self.scraperMethod()
        if printID:
            print("Worker " + str(id) + " is quitting.")
        # TODO: kill this thread.
        return

    def NumWorkers(self):
        return 1  # Simplified for testing purposes.

    @abstractmethod
    def scraperMethod(self):
        pass

class TestScraper(ScraperPool):
    def scraperMethod(self):
        # print("I am scraping.")
        # print("Scraping. Threads#: " + str(active_count()))
        temp_item = self.Queue[-1]
        self.Queue.pop()
        self.ResultList.append(temp_item)

    def HandleResults(self):
        print(self.ResultList)

ScraperPool.register(TestScraper)

scraper = TestScraper(Queue=["Jaap", "Piet"])
scraper.run()
print(threading.active_count())
# print(scraper.ResultList)
When all the threads are done, there's still one active thread - threading.active_count() on the last line gets me that number.
The active thread is <_MainThread(MainThread, started 12960)> - as printed with threading.enumerate().
Can I assume that all my threads are done when active_count() == 1?
Or can, for instance, imported modules start additional threads, so that my threads are actually done even when active_count() > 1 - which is also the condition for the loop I'm using in the run method?
You can assume that your threads are done when active_count() reaches 1. The problem is that if any other module creates a thread, you'll never get to 1. You should manage your threads explicitly.
Example: you can put the threads in a list and join them one at a time. The relevant changes to your code are:
def __init__(self, Queue, MaxNumWorkers=0, ItemsPerWorker=50):
    # Initialize attributes
    self.MaxNumWorkers = MaxNumWorkers
    self.ItemsPerWorker = ItemsPerWorker
    self.Queue = Queue  # For testing purposes.
    self.WorkerThreads = []

def initWorkerPool(self, PrintIDs=True):
    for w in range(self.NumWorkers()):
        thread = Thread(target=self.worker, args=(w + 1, PrintIDs,))
        self.WorkerThreads.append(thread)
        thread.start()
        sleep(1)  # Explicitly wait one second for this worker to start.

def run(self):
    self.initWorkerPool()
    # Wait until all workers (i.e. threads) are done. Waiting in order,
    # so some threads further in the list may finish first, but we
    # will get to all of them eventually.
    while self.WorkerThreads:
        self.WorkerThreads.pop(0).join()
    self.HandleResults()
According to the docs, active_count() includes the main thread, so if you're at 1 then you're most likely done, but if you have another source of new threads in your program then you may be done before active_count() hits 1.
I would recommend implementing an explicit join method on your ScraperPool: keep track of your workers and explicitly join them back to the main thread when needed, instead of checking whether you're done with active_count() calls - see the sketch below.
Also remember about the GIL: threads give you concurrency, but not parallel execution of CPU-bound work...
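A minimal sketch of that idea - relevant methods only; the join_workers name and WorkerThreads attribute are illustrative, not from the original code:

def initWorkerPool(self, PrintIDs=True):
    self.WorkerThreads = []  # keep explicit references to the workers
    for w in range(self.NumWorkers()):
        thread = Thread(target=self.worker, args=(w + 1, PrintIDs,))
        thread.start()
        self.WorkerThreads.append(thread)

def join_workers(self):
    # blocks until every worker is done, regardless of what threads
    # other modules may have started in the meantime
    for thread in self.WorkerThreads:
        thread.join()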
I have a Python GUI program that needs to do the same task in several threads. The problem is that when I start the threads, they don't execute in parallel but sequentially: the first one executes and ends, then the second one, and so on. I want them to start independently.
The main components are:
1. Menu (view)
2. ProcesStarter (controller)
3. Process (controller)
The Menu is where you click the "Start" button, which calls a function on ProcesStarter.
The ProcesStarter creates Process objects and threads, and starts all the threads in a for loop.
Menu:
class VotingFrame(BaseFrame):
    def create_widgets(self):
        self.start_process = tk.Button(root, text="Start Process", command=lambda: self.start_process())
        self.start_process.grid(row=3, column=0, sticky=tk.W)

    def start_process(self):
        procesor = XProcesStarter()
        procesor_thread = Thread(target=procesor.start_process())
        procesor_thread.start()
ProcesStarter:
class XProcesStarter:
    def start_process(self):
        print "starting new process..."
        # thread count
        thread_count = self.get_thread_count()
        # initialize Process objects with data, and start threads
        for i in range(thread_count):
            vote_process = XProcess(self.get_proxy_list(), self.get_url())
            t = Thread(target=vote_process.start_process())
            t.start()
Process:
class XProcess():
    def __init__(self, proxy_list, url, browser_show=False):
        # init code
        pass

    def start_process(self):
        # code for process
        pass
When I press the GUI button "Start Process", the GUI is locked until both threads finish execution.
The idea is that the threads should work in the background, in parallel.
You call procesor.start_process() immediately when specifying it as the target of the Thread:
# use this
procesor_thread = Thread(target=procesor.start_process)

# not this
procesor_thread = Thread(target=procesor.start_process())
#  this is called right away ^
If you call it right away, it returns None, which is a valid target for Thread (it just does nothing) - which is why everything runs sequentially: the threads themselves are not doing anything.
One way to use a class as the target of a thread is to pass the class itself as the target and the arguments to its constructor as args.
from threading import Thread
from time import sleep
from random import randint

class XProcesStarter:
    def __init__(self, thread_count):
        print("starting new process...")
        self._i = 0
        for i in range(thread_count):
            t = Thread(
                target=XProcess,
                args=(self.get_proxy_list(), self.get_url())
            )
            t.start()

    def get_proxy_list(self):
        self._i += 1
        return "Proxy list #%s" % self._i

    def get_url(self):
        self._i += 1
        return "URL #%d" % self._i

class XProcess():
    def __init__(self, proxy_list, url, browser_show=False):
        r = 0.001 * randint(1, 5000)
        sleep(r)
        print(proxy_list)
        print(url)

def main():
    t = Thread(target=XProcesStarter, args=(4,))
    t.start()

if __name__ == '__main__':
    main()
This code runs in python2 and python3.
The reason is that the target of a Thread object must be a callable (search for "callable" and "__call__" in the Python documentation for a complete explanation).
Edit: the other way has been explained in other people's answers (see Tadhg McDonald-Jensen's).
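As a tiny self-contained illustration of the difference (not from the question's code):

from threading import Thread

def work():
    print("running in a thread")

Thread(target=work).start()    # passes the function itself; it runs in the new thread
Thread(target=work()).start()  # calls work() in the main thread first; the new
                               # thread then gets target=None and does nothing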
I think your issue is that in both places where you start threads, you're actually calling the method you want to pass as the target to the thread. That runs its code in the main thread (and, once it's done, tries to start the new thread on the return value, if any).
Try:
procesor_thread = Thread(target=procesor.start_process) # no () after start_process
And:
t = Thread(target=vote_process.start_process) # no () here either