Python threads aren't terminating after target method has finished executing

I'm having some trouble with Python threads. I'm writing a software package that plots data received from multiple devices. I have a plot thread that plots the data once it has received a set of data from all devices, and a data retrieval thread for each device. The application plots data continuously (as fast as data can be retrieved from the device) until the user hits a button. I have a threading.Event(), self.stop_thread, that is checked frequently to back out of the threaded loops. The threads hit the check and break out of the loop, but are still 'running' according to my debugger and threading.active_count(). Does anyone know why this is happening, and how can I get it to stop? I need to know these threads are gone before I move on to another function of the application. The following three methods are where the issues arise.
# initializes startup settings, starts a thread to carry out
# plotting and a separate thread to carry out data retrieval
def start_plot_threads(self):
    if not self.abstraction.connected:
        self.connect_to_device()
        if not self.abstraction.connected:
            return
    self.stop_thread.clear()
    self.pause_thread.clear()
    for device in self.devices:
        device.pause_thread.clear()
        device.stop_thread.clear()
        device.change_units.set()
    self.presentation.enable_derivative()
    self.presentation.show_average_button.SetValue(False)
    self.presentation.show_average_button.Disable()
    self.abstraction.multi_plot_data = {}
    try:
        if self.plot_thread.is_alive():
            return
    except Exception:
        pass
    self.plot_thread = Thread(target=self.plot_data)
    self.plot_thread.daemon = True
    self.plot_thread.start()
    for device in self.devices:
        thread = Thread(target=self.retrieve_data,
                        kwargs={'device': device},
                        name="Data Retrieval Thread %s" % device.instr_id)
        thread.daemon = True
        thread.start()
# waits for plot data to be thrown on a thread-safe queue by the data
# retrieval thread and plots it. data comes in as a tuple of the form
# (y_data, label, x_data)
def plot_data(self):
    multiplot = False
    if len(self.devices) > 1:
        multiplot = True
    plot_data = []
    while not self.stop_thread.is_set():
        try:
            data = self.plot_data_queue.get()
        except Empty:
            pass
        else:
            if multiplot:
                scan = {}
                scan['y_data'] = [data[0]]
                scan['labels'] = [data[1]]
                scan['x_data'] = data[2]
                plot_data.append(scan)
                if len(plot_data) == len(self.devices):
                    self.presentation.plot_multiline(plot_data, average=False)
                    self.abstraction.multi_plot_data = plot_data
                    plot_data = []
            else:
                self.presentation.plot_signal(data[0], data[1])
# the intent is that the data retrieval thread stays in this loop while
# taking continuous readings
def retrieve_data(self, device):
    while True:
        if device.stop_thread.is_set():
            return
        while device.pause_thread.is_set():
            if device.stop_thread.is_set():
                return
            sleep(0.1)
        y = self.get_active_signal_data(device)
        if not y:
            return
        self.plot_data_queue.put(
            (y, device.name, device.x_data))
        self.abstraction.y_data = [y]
        try:
            self.update_spectrum(device)
        except DeviceCommunicationError, data:
            self.presentation.give_connection_error(data)
            self.presentation.integ_time = device.prev_integ
I apologize for the extra bulk in the methods. They are straight from my code base.

From the code shown it isn't clear why your threads continue running: it depends on what actually sets device.stop_thread, which isn't visible here.
However, you can guarantee that all your threads have stopped by retaining a handle on each thread (appending each thread object to a list); once you have started all your threads, you can then proceed to join() them.
threads = []
for job in batch:
    thr = threading.Thread(target=do_job, args=(job,))
    thr.start()
    threads.append(thr)
# join all the threads
for thr in threads:
    thr.join()
Join will wait for the thread to complete before moving on.
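join() also accepts a timeout, which is useful when you suspect a thread is wedged; a minimal sketch building on the list above, which waits a bounded time per thread and then reports stragglers:

for thr in threads:
    thr.join(timeout=2.0)  # wait at most 2 seconds per thread

# Any thread still alive at this point did not finish in time
stuck = [thr.name for thr in threads if thr.is_alive()]
if stuck:
    print("Still running:", stuck)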
Python Docs:
https://docs.python.org/2/library/threading.html
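One observation on the question's own code (an educated guess, not a confirmed diagnosis): plot_data_queue.get() is called with no timeout, so it blocks until an item arrives and never raises Empty; if no more data arrives after the stop event is set, the plot thread can sit in get() forever instead of reaching the stop check. A sketch of a consumer loop that stays responsive to the event:

from Queue import Empty  # Python 2; on Python 3 use `from queue import Empty`

while not self.stop_thread.is_set():
    try:
        # Block for at most 0.1 s so the stop event is re-checked regularly
        data = self.plot_data_queue.get(timeout=0.1)
    except Empty:
        continue
    # ... plot `data` as before ...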

Related

How to separately start and stop multiprocessing processes in Python?

I use a dedicated Python (3.8) library to control a motor drive via a USB port.
The Python library provided by the motor control drive manufacturers (ODrive) allows a single Python process to control one or more drives.
However, I would like to run 3 processes, each controlling 1 drive.
After researching options (I first considered virtual machines, Docker containers, and multi-threading) I came to believe that the easiest way to do this would be to use multiprocessing.
My problem is that I would then need a way to manage (i.e., start, monitor, and stop independently) multiple processes. The practical reason behind this is that the motors are connected to different setups. Each setup must be able to be stopped and restarted separately if malfunctioning, for instance, but other running setups should not be affected by this action.
After reading around the internet and Stack Overflow, I now understand how to create a pool of processes, how to associate processes with processor cores, how to start a pool of processes, and how to queue/join them (the latter not being needed for me).
What I don't know is how to manage them independently.
How can I separately start/stop different processes without affecting the execution of the others?
Are there libraries to manage them (perhaps even with a GUI)?
I'd probably do something like this:
import random
import time
from multiprocessing import Process, Queue


class MotorProcess:
    def __init__(self, name, com_related_params):
        self.name = name
        # Made up some parameters relating to communication
        self._params = com_related_params
        self._command_queue = Queue()
        self._status_queue = Queue()
        self._process = None

    def start(self):
        if self._process and self._process.is_alive():
            return
        self._process = Process(target=self.run_processing,
                                args=(self._command_queue, self._status_queue,
                                      self._params))
        self._process.start()

    @staticmethod
    def run_processing(command_queue, status_queue, params):
        while True:
            # Check for commands
            if not command_queue.empty():
                msg = command_queue.get(block=True, timeout=0.05)
                if msg == "stop motor":
                    status_queue.put("Stopping motor")
                elif msg == "exit":
                    return
                elif msg.startswith("move"):
                    status_queue.put("moving motor to blah")
                    # TODO: msg parsing and move motor
                else:
                    status_queue.put("unknown command")
            # Update status
            # TODO: query motor status
            status_queue.put(f"Motor is {random.randint(0, 100)}")
            time.sleep(0.5)

    def is_alive(self):
        if self._process and self._process.is_alive():
            return True
        return False

    def get_status(self):
        if not self.is_alive():
            return ["not running"]
        # Empty the queue
        recent = []
        while not self._status_queue.empty():
            recent.append(self._status_queue.get(False))
        return recent

    def stop_process(self):
        if not self.is_alive():
            return
        self._command_queue.put("exit")
        # Empty the status queue, otherwise it could potentially stop
        # the process from closing.
        while not self._status_queue.empty():
            self._status_queue.get()
        self._process.join()

    def send_command(self, command):
        self._command_queue.put(command)


if __name__ == "__main__":
    processes = [MotorProcess("1", None), MotorProcess("2", None)]
    while True:
        cmd = input()
        if cmd == "start 1":
            processes[0].start()
        elif cmd == "move 1 to 100":
            processes[0].send_command("move to 100")
        elif cmd == "exit 1":
            processes[0].stop_process()
        else:
            for n, p in enumerate(processes):
                print(f"motor {n + 1}", end="\n\t")
                print("\n\t".join(p.get_status()))
Not production ready (e.g. no exception handling, no proper command parsing, etc.) but shows the idea.
Shout if there are any problems :D
You can create multiple multiprocessing.Process instances manually like this:

def my_func(a, b):
    pass

p = multiprocessing.Process(target=my_func, args=(100, 200))
p.start()

and manage them using the multiprocessing primitives Queue, Event, Condition, etc. Please refer to the official documentation for details: https://docs.python.org/3/library/multiprocessing.html
In the following example, multiple processes are started and stopped independently. An Event is used to determine when to stop a process. A Queue is used to pass results from the child processes to the main process.
import multiprocessing
import queue
import random
import time


def worker_process(
    process_id: int,
    results_queue: multiprocessing.Queue,
    to_stop: multiprocessing.Event,
):
    print(f"Process {process_id} is started")
    while not to_stop.is_set():
        print(f"Process {process_id} is working")
        time.sleep(0.5)
        result = random.random()
        results_queue.put((process_id, result))
    print(f"Process {process_id} exited")


process_pool = []
result_queue = multiprocessing.Queue()

while True:
    if random.random() < 0.3:
        # starting a new process
        process_id = random.randint(0, 10_000)
        to_stop = multiprocessing.Event()
        p = multiprocessing.Process(
            target=worker_process, args=(process_id, result_queue, to_stop)
        )
        p.start()
        process_pool.append((p, to_stop))
    if random.random() < 0.2:
        # closing a random process
        if process_pool:
            process, to_stop = process_pool.pop(
                random.randint(0, len(process_pool) - 1)
            )
            to_stop.set()
            process.join()
    try:
        p_id, result = result_queue.get_nowait()
        print(f"Completed: process_id={p_id} result={result}")
    except queue.Empty:
        pass
    time.sleep(1)
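One caveat worth adding to the example above (my note, not the answerer's): on platforms that use the spawn start method (Windows, and macOS on recent Python versions), the process-launching loop should live under an if __name__ == "__main__": guard, otherwise each child re-imports the module and would try to spawn processes of its own.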

Dynamically generating new threads

I want to be able to run multiple threads without writing a new line of code for every thread. With the code below, I cannot dynamically add more accountIDs or increase the number of threads just by changing thread_count.
For example this is my code now:
import threading

def get_page_list(account, thread_count):
    return list_of_pages_split_by_threads

def pull_data(page_list, account_id):
    data = api(page_list, account_id)
    return data

if __name__ == "__main__":
    accountIDs = [100]
    # number of threads to make:
    thread_count = 3
    # Returns a list of page lists, e.g.: [[1,2,3],[4,5,6],[7,8,9,10]]
    page_lists = get_page_list(accountIDs[0], thread_count)
    t1 = threading.Thread(target=pull_data, args=(page_lists[0], accountIDs[0]))
    t2 = threading.Thread(target=pull_data, args=(page_lists[1], accountIDs[0]))
    t3 = threading.Thread(target=pull_data, args=(page_lists[2], accountIDs[0]))
    t1.start()
    t2.start()
    t3.start()
    t1.join()
    t2.join()
    t3.join()
This is where I want to get to:
Any time I want to add an additional thread (if the server can handle it) or add more accountIDs, I shouldn't have to duplicate code.
For example, this is what I would like to do, but the code below doesn't work: it finishes a whole list of pages before moving on to the next thread.
if __name__ == "__main__":
    accountIDs = [100, 101, 103]
    thread_count = 3
    for account in accountIDs:
        page_lists = get_page_list(account, thread_count)
        for pg_list in page_lists:
            t1 = threading.Thread(target=pull_data, args=(pg_list, account))
            t1.start()
            t1.join()
One way of doing it is using Pool and Queue.
The pool will keep working while there are items in the queue, without holding the main thread.
Choose one of these imports:
import multiprocessing as mp (for process-based parallelization)
import multiprocessing.dummy as mp (for thread-based parallelization)
Creating the workers, pool and queue:
the_queue = mp.Queue() #store the account ids and page lists here
def worker_main(queue):
while waiting == True:
while not queue.empty():
account, pageList = queue.get(True) #get an id from the queue
pull_data(pageList, account)
waiting = True
the_pool = mp.Pool(num_parallel_workers, worker_main,(the_queue,))
# don't forget the coma here ^
accountIDs = [100,101,103]
thread_count = 3
for account in accountIDs:
list_of_page_lists = get_page_list(account, thread_count)
for pg_list in page_list:
the_queue.put((account, pg_list))
....
waiting = False #while you don't do this, the pool will probably never end.
#not sure if it's a good practice, but you might want to have
#the pool hanging there for a while to receive more items
the_pool.close()
the_pool.join()
Another option is to fill the queue first and create the pool second, so the workers run only while there are items in the queue.
Then, if more data arrives, you create another queue and another pool:
import multiprocessing.dummy as mp
# If you are not using dummy, you will probably need a queue for the results too,
# as the processes will not access the vars from the main thread.
# Something like worker_main(input_queue, output_queue):
# and pull_data(pageList, account, output_queue)
# and mp.Pool(num_parallel_workers, worker_main, (in_queue, out_queue))
# and you get the results from the output queue after pool.join()

the_queue = mp.Queue()  # store the account ids and page lists here

def worker_main(queue):
    while not queue.empty():
        account, pageList = queue.get(True)  # get an item from the queue
        pull_data(pageList, account)

accountIDs = [100, 101, 103]
thread_count = 3
for account in accountIDs:
    list_of_page_lists = get_page_list(account, thread_count)
    for pg_list in list_of_page_lists:
        the_queue.put((account, pg_list))

the_pool = mp.Pool(num_parallel_workers, worker_main, (the_queue,))
# don't forget the comma in (the_queue,) above

the_pool.close()
the_pool.join()

del the_queue
del the_pool
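A side note on the pattern above (my observation, not the answerer's): worker_main is passed as the Pool's initializer, so each of the num_parallel_workers workers runs it once on startup, and the "pool" is effectively just N workers draining the queue; no tasks are ever submitted through apply or map. the_pool.close() followed by the_pool.join() then waits for those workers to return.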
I couldn't get MP to work correctly, so I did this instead and it seems to work great. But MP is probably the better way to tackle this problem.
# Just keeps track of the threads
threads = []
# Generates a thread for whatever thread_count = N is set to
for thread in range(thread_count):
    # get_page_list returns a list of page lists stored in page_lists; this ensures each thread gets a unique list
    page_list = page_lists[thread]
    # actual function for each thread to work on
    t = threading.Thread(target=pull_data, args=(page_list, account))
    # puts all threads into a list
    threads.append(t)
    # starts all the threads
    t.start()
# After all threads are complete, back to the main thread.. technically this is not needed
for t in threads:
    t.join()
I also didn't understand why you would "need" .join(); there's a great answer here:
what is the use of join() in python threading
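For completeness, a possible alternative (my addition, using only the pull_data and get_page_list functions from the question): the standard library's concurrent.futures gives the same dynamic fan-out with less bookkeeping, since the executor schedules any number of submitted jobs across a fixed pool of worker threads.

from concurrent.futures import ThreadPoolExecutor

accountIDs = [100, 101, 103]
thread_count = 3

# One pool sized by thread_count; submit every (page_list, account) pair
# and let the executor spread them across the worker threads.
with ThreadPoolExecutor(max_workers=thread_count) as executor:
    futures = []
    for account in accountIDs:
        for pg_list in get_page_list(account, thread_count):
            futures.append(executor.submit(pull_data, pg_list, account))
    results = [f.result() for f in futures]  # blocks until all jobs finish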

My process finishes its `run` function, but it doesn't die

I'm subclassing multiprocessing.Process to create a class that will asynchronously grab images from a camera and push them to some queues for display and saving to disk.
The problem I'm having is that when I issue a stop command using a multiprocessing.Event object that belongs to the Process-descendant-object, the process successfully completes the last line of the run function, but then it doesn't die. The process just continues to exist and continues to return true from the is_alive function. I don't understand how this could be possible. What would cause a process to complete its run function but not die?
Maddeningly, when I separated this object from the larger context I'm using it in (which includes several other Process subclasses also running simultaneously), I can't reproduce the problem, which tends to make me think it has something to do with the rest of the code, but I don't understand how that could be - if it executed the last line of the run function, shouldn't it die regardless of what else is going on? I must be misunderstanding something about how a Process object works.
Here's the code below. When I run it, I see the message "Video acquire process STOPPED" printed out, but the process doesn't die.
class VideoAcquirer(mp.Process):
    def __init__(self, camSerial, imageQueue, monitorImageQueue, acquireSettings={}, monitorFrameRate=15):
        mp.Process.__init__(self, daemon=True)
        self.camSerial = camSerial
        self.acquireSettings = acquireSettings
        self.imageQueue = imageQueue
        self.monitorImageQueue = monitorImageQueue
        self.monitorFrameRate = monitorFrameRate
        self.stop = mp.Event()

    def stopProcess(self):
        print('Stopping video acquire process')
        self.stop.set()

    def run(self):
        system = PySpin.System.GetInstance()
        camList = system.GetCameras()
        cam = camList.GetBySerial(self.camSerial)
        cam.Init()
        nodemap = cam.GetNodeMap()
        setCameraAttributes(nodemap, self.acquireSettings)
        cam.BeginAcquisition()
        monitorFramePeriod = 1.0 / self.monitorFrameRate
        print("Video monitor frame period:", monitorFramePeriod)
        lastTime = time.time()
        k = 0
        im = imp = imageResult = None
        print("Image acquisition begins now!")
        while not self.stop.is_set():
            try:
                # Retrieve next received image
                print(1)
                imageResult = cam.GetNextImage(100)  # Timeout of 100 ms to allow for stopping process
                print(2)
                # Ensure image completion
                if imageResult.IsIncomplete():
                    print('Image incomplete with image status %d...' % imageResult.GetImageStatus())
                else:
                    # Print image information; height and width recorded in pixels
                    width = imageResult.GetWidth()
                    height = imageResult.GetHeight()
                    k = k + 1
                    print('Grabbed Image %d, width = %d, height = %d' % (k, width, height))
                    im = imageResult.Convert(PySpin.PixelFormat_Mono8, PySpin.HQ_LINEAR)
                    imp = PickleableImage(im.GetWidth(), im.GetHeight(), 0, 0, im.GetPixelFormat(), im.GetData())
                    self.imageQueue.put(imp)
                    # Put the occasional image in the monitor queue for the UI
                    thisTime = time.time()
                    if (thisTime - lastTime) >= monitorFramePeriod:
                        # print("Sent frame for monitoring")
                        self.monitorImageQueue.put((self.camSerial, imp))
                        lastTime = thisTime
                imageResult.Release()
                print(3)
            except PySpin.SpinnakerException as ex:
                pass  # Hopefully this is just because there was no image in camera buffer
                # print('Error: %s' % ex)
                # traceback.print_exc()
                # return False
        # Send stop signal to write process
        print(4)
        self.imageQueue.put(None)
        camList.Clear()
        cam.EndAcquisition()
        cam.DeInit()
        print(5)
        del cam
        system.ReleaseInstance()
        del nodemap
        del imageResult
        del im
        del imp
        del camList
        del system
        print("Video acquire process STOPPED")
I start the process from a tkinter GUI thread roughly like this:
import multiprocessing as mp
camSerial = '2318921'
queue = mp.Queue()
videoMonitorQueue = mp.Queue()
acquireSettings = [('AcquisitionMode', 'Continuous'), ('TriggerMode', 'Off'), ('TriggerSource', 'Line0'), ('TriggerMode', 'On')]
v = VideoAcquirer(camSerial, queue, videoMonitorQueue, acquireSettings=acquireSettings, monitorFrameRate=15)
v.start()
And here's roughly how I stop the process, also from the tkinter GUI thread:
v.stopProcess()
Thanks for your help.
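A documented behavior that matches this symptom (offered as a pointer, not a confirmed diagnosis): per the multiprocessing docs ("Joining processes that use queues"), a process that has put items on a multiprocessing.Queue will not terminate until the queue's feeder thread has flushed all buffered items to the underlying pipe. So if nothing drains imageQueue or monitorImageQueue, run() can return while the process stays alive. A sketch of a shutdown that drains the queues before joining, using the names from the snippets above:

v.stopProcess()
# Drain both queues so the child's feeder thread can flush and exit
while not queue.empty():
    queue.get_nowait()
while not videoMonitorQueue.empty():
    videoMonitorQueue.get_nowait()
v.join(timeout=5)
print(v.is_alive())  # should now be False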

How to implement a dynamic amount of concurrent threads?

I am launching concurrent threads doing some stuff:
concurrent = 10
q = Queue(concurrent * 2)
for j in range(concurrent):
    t = threading.Thread(target=doWork)
    t.daemon = True
    t.start()
try:
    # process each line and assign it to an available thread
    for line in call_file:
        q.put(line)
    q.join()
except KeyboardInterrupt:
    sys.exit(1)
At the same time I have a distinct thread counting time:
def printit():
    threading.Timer(1.0, printit).start()
    print current_status

printit()
I would like to increase (or decrease) the number of concurrent threads in the main process, say, every minute. I can make a time counter in the timer thread and have it do things every minute, but how do I change the number of concurrent threads in the main process?
Is it possible (and if so, how)?
This is my worker:
def UpdateProcesses(start, processnumber, CachesThatRequireCalculating, CachesThatAreBeingCalculated, CacheDict, CacheLock, IdleLock, FileDictionary, MetaDataDict, CacheIndexDict):
    NewPool()
    while start[processnumber]:
        IdleLock.wait()
        while len(CachesThatRequireCalculating) > 0 and start[processnumber] == True:
            CacheLock.acquire()
            try:
                cacheCode = CachesThatRequireCalculating[0]  # The list can be empty if another process takes the last item during the CacheLock
                CachesThatRequireCalculating.remove(cacheCode)
                print cacheCode, "starts processing by", processnumber, "process"
            except:
                CacheLock.release()
            else:
                CacheLock.release()
                CachesThatAreBeingCalculated.append(cacheCode[:3])
                Array, b, f = TIPP.LoadArray(FileDictionary[cacheCode[:2]])  # opens the dask array
                Array = ((Array[:, :, CacheIndexDict[cacheCode[:2]][cacheCode[2]]:CacheIndexDict[cacheCode[:2]][cacheCode[2]+1]].compute()/2.**(MetaDataDict[cacheCode[:2]]["Bit Depth"])*255.).astype(np.uint16)).transpose([1, 0, 2])  # slices and calculates the array
                f.close()  # close the file
                if CachesThatAreBeingCalculated.count(cacheCode[:3]) != 0:  # if not, this cache is not needed anymore (the cacheCode is removed by wavelength change)
                    CachesThatAreBeingCalculated.remove(cacheCode[:3])
                    try:  # If the object is not available the first time, try a second time
                        CacheDict[cacheCode[:3]] = Array
                    except:
                        CacheDict[cacheCode[:3]] = Array
                print cacheCode, "done processing by", processnumber, "process"
        if start[processnumber]:
            IdleLock.clear()
This is how I start them:
self.ProcessLst = []  # list with all the processes that calculate the caches
for processnumber in range(min(NumberOfMaxProcess, self.processes)):
    self.ProcessTerminateLst.append(True)
for processnumber in range(min(NumberOfMaxProcess, self.processes)):
    self.ProcessLst.append(process.Process(target=Proc.UpdateProcesses, args=(self.ProcessTerminateLst, processnumber, self.CachesThatRequireCalculating, self.CachesThatAreBeingCalculated, self.CacheDict, self.CacheLock, self.IdleLock, self.FileDictionary, self.MetaDataDict, self.CacheIndexDict,)))
    self.ProcessLst[-1].daemon = True
    self.ProcessLst[-1].start()
I close them like this:
for i in range(len(self.ProcessLst)): #For both while loops in the processes self.ProcessTerminateLst[i] must be True. So or the process is now ready to be terminad or is still in idle mode.
self.ProcessTerminateLst[i] = False
self.IdleLock.set() #Makes sure no process is in Idle and all are ready to be terminated
I would use a pool. A pool has a maximum number of threads it uses at the same time, but you can submit any number of jobs; they stay in the waiting list until a thread is available. I don't think you can change the number of concurrent processes in a pool.
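If you really do need to grow and shrink the worker count at runtime, that isn't a standard Pool feature, but it can be sketched by hand (my sketch, not part of the answer above; the doWork-style job function is assumed from the question, here taking the item as an argument): keep all workers pulling from one shared queue and give each worker its own stop event, so adding a worker means starting a thread and removing one means setting its event.

import threading
import Queue as queue  # Python 2; on Python 3 use `import queue`

work_q = queue.Queue()
workers = []  # list of (thread, stop_event) pairs

def worker(stop_event):
    while not stop_event.is_set():
        try:
            item = work_q.get(timeout=0.5)  # stay responsive to the stop event
        except queue.Empty:
            continue
        doWork(item)  # assumed job function, as in the question
        work_q.task_done()

def add_worker():
    stop_event = threading.Event()
    t = threading.Thread(target=worker, args=(stop_event,))
    t.daemon = True
    t.start()
    workers.append((t, stop_event))

def remove_worker():
    if workers:
        t, stop_event = workers.pop()
        stop_event.set()  # this worker exits after its current item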

multiprocessing script to scan for new values and put in queue not working

Here is my script:
# globals
MAX_PROCESSES = 50
my_queue = Manager().Queue()  # queue to store our values
stop_event = Event()  # flag which signals processes to stop
my_pool = None

def my_function(var):
    while not stop_event.is_set():
        # this script will run forever for each variable found
        return

def var_scanner():
    # Since `t` could have unlimited size we'll put all `t` values in the queue
    while not stop_event.is_set():  # forever scan `values` for new items
        x = Variable.objects.order_by('var').values('var__var')
        for t in x:
            t = t.values()
            my_queue.put(t)
        time.sleep(10)

try:
    var_scanner = Process(target=var_scanner)
    var_scanner.start()
    my_pool = Pool(MAX_PROCESSES)
    while not stop_event.is_set():
        try:  # if queue isn't empty, get value from queue and create new process
            var = my_queue.get_nowait()  # getting value from queue
            p = Process(target=my_function, args=("process-%s" % var))
            p.start()
        except Queue.Empty:
            print "No more items in queue"
except KeyboardInterrupt as stop_test_exception:
    print(" CTRL+C pressed. Stopping test....")
    stop_event.set()
However, I don't think this script is exactly what I want. Here's what I was looking for when I wrote it: I want it to scan the "Variables" table for variables, add "new" variables to the queue if they don't already exist, and run "my_function" for each variable in the queue.
I believe I have WAYYYY too many while not stop_event.is_set() loops, because right now it just prints out "No more items in queue" about a million times.
Please HELP!! :)
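One concrete thing that would explain the flood of "No more items in queue" (my observation; the source records no answer to this question): get_nowait() raises Queue.Empty immediately whenever the queue is empty, and the surrounding while loop retries with no pause, so the except branch prints on every iteration. A blocking get with a timeout quiets the loop while keeping it responsive to stop_event:

try:
    var = my_queue.get(timeout=1)  # wait up to 1 s for a new item
    p = Process(target=my_function, args=("process-%s" % var,))
    p.start()
except Queue.Empty:
    pass  # nothing new this time; loop around and re-check stop_event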
