How do you stop all running threads in Python when one fails?

I am working on a Python 3 script designed to stress test an ARP service. It creates up to 255 "worker" threads and has them send up to 2^15 packets to stress test the server. This is the main test script that I run:
if __name__ == '__main__':
    for i in range(0, 8):
        for j in range(0, 15):
            print("Multithreading test with", 2**i, "workers and ", 2**j,
                  "packets:")
            sleep(3)
            try:
                arp_load_test.__main__(2**i)
            except:
                print("Multithreading test failed at", 2**i, "workers and ",
                      2**j, "packets:")
                break
    print("Moving on to the multiprocessing test.")
    sleep(10)
    for i in range(0, 15):
        print("Multiprocessing test with", 2**i, "workers:")
        sleep(3)
        try:
            arp_load_test2.__main__(2**i)
        except:
            print("Multiprocessing test failed at", 2**i, "workers.")
            print("\n\n\t\t\t\tDONE!")
            break
The first loop tests multithreading and the second does the same thing with multiprocessing; arp_load_test.py is the multithreading version of arp_load_test2.py. In the except branch of each for loop I want to end the loop as soon as one of the threads fails. How do I do that? Here's the code for arp_load_test.py (arp_load_test2.py is almost exactly the same):
def __main__(threaders = 10, items = 10):
    print("\tStarting threading main method\n")
    sleep(1)
    a = datetime.datetime.now()
    # create a logger object
    logging.basicConfig(filename = "arp_multithreading_log.txt",
                        format = "%(asctime)s %(message)s",
                        level = logging.INFO)
    # default values
    interface = "enp0s8"
    workers = threaders
    packets = items
    dstip = "192.168.30.1"
    # parse the command line
    try:
        opts, args = getopt.getopt(sys.argv[1:], "i:w:n:d:", ["interface=",
                                                              "workers=",
                                                              "packets=",
                                                              "dstip="])
    except getopt.GetoptError as err:
        print("Error: ", str(err), file = sys.stderr)
        sys.exit(-1)
    # override defaults with the options passed in on the command line
    for o, a in opts:
        if o in ("-i", "--interface"):
            interface = a
        elif o in ("-w", "--workers"):
            w = int(a)
            if w > 254:
                workers = 254
                print("Max worker threads is 254. Using 254 workers",
                      file = sys.stderr)
            elif w < 1:
                workers = 1
                print("Min worker threads is 1. Using 1 worker",
                      file = sys.stderr)
            else:
                workers = w
        elif o in ("-n", "--packets"):
            packets = int(a)
        elif o in ("-d", "--dstip"):
            dstip = a
        else:
            assert False, "unhandled option"
    # create an empty list as a thread pool
    office = []
    # give all the workers jobs in the office
    for i in range(workers):
        office.append(ArpTestThread(i, "ARP-" + str(i), i, interface, packets,
                                    dstip))
    # do all the work
    logging.info("BEGIN ARP FLOOD TEST")
    for worker in office:
        worker.daemon = True
        worker.start()
    for worker in office:
        worker.join()
    b = datetime.datetime.now()
    print("\tSent", len(office) * packets, "packets!\n")
    print("It took", a - b, "seconds!")
    logging.info("END ARP FLOOD TEST\n")
    sleep(5)
##### end __main__
ArpTestThread is a subclass of threading.Thread (or Process) that is set up to send the packets to the ARP service. Also, I'm running the test script from inside a VM via the terminal, but I am not using any of the command-line options the program is set up to use; I just added parameters instead because I was being lazy.
Do I need to place a try block inside of the class file instead of the test script? I was given 90% of the class file code already complete, and am updating it and trying to collect data on what it does, along with optimizing it to properly stress the ARP service. I want the for loops in the test script (the very first portion of code in this post) to break, stop all currently running threads, and print out at what point the program failed as soon as one of the threads/processes crashes. Is that possible?
EDIT:
The suggested duplicate question does not solve my problem. I am trying to send packets, and it does not raise an exception until the program essentially runs out of memory to continue sending packets to the ARP service. I don’t get an exception until the program itself breaks, so the possible solution that suggested using a simple signal does not work.
The program can finish successfully. Threads/processes can (and should) get started, send a packet, and then close up. If something happens in any singular thread/process, I want everything currently running to stop and then essentially print out an error message to the console.
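One way to get that behavior (a minimal sketch, not the original ArpTestThread code) is to have every worker watch a shared threading.Event and report its own failure on a queue: the first failure flips the event, the remaining workers exit on their next check, and the main function re-raises so the outer test script's except block fires. The send_one_arp_packet() call below is a hypothetical stand-in for whatever the real worker does per packet.
import queue
import threading

stop_event = threading.Event()
errors = queue.Queue()

def worker(worker_id, packets):
    try:
        for _ in range(packets):
            if stop_event.is_set():       # another worker already failed
                return
            send_one_arp_packet()         # hypothetical per-packet send routine
    except Exception as exc:
        errors.put((worker_id, exc))      # record who failed and why
        stop_event.set()                  # tell every other worker to stop

def run_test(workers, packets):
    threads = [threading.Thread(target=worker, args=(i, packets), daemon=True)
               for i in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    if not errors.empty():
        worker_id, exc = errors.get()
        # re-raise so the caller's except block can break out of its loop
        raise RuntimeError("worker {} failed: {}".format(worker_id, exc))
The test script at the top would then catch that RuntimeError in its except block, print where the run failed, and break.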

Related

ProcessPoolExecutor, BrokenProcessPool handling

In this documentation ( https://pymotw.com/3/concurrent.futures/ ) it says:
"The ProcessPoolExecutor works in the same way as ThreadPoolExecutor, but uses processes instead of threads. This allows CPU-intensive operations to use a separate CPU and not be blocked by the CPython interpreter’s global interpreter lock."
This sounds great! It also says:
"If something happens to one of the worker processes to cause it to exit unexpectedly, the ProcessPoolExecutor is considered “broken” and will no longer schedule tasks."
This sounds bad :( So I guess my question is: what is considered "unexpectedly"? Does that just mean the exit signal is not 1? Can I safely exit the thread and still keep processing a queue? The example is as follows:
from concurrent import futures
import os
import signal

with futures.ProcessPoolExecutor(max_workers=2) as ex:
    print('getting the pid for one worker')
    f1 = ex.submit(os.getpid)
    pid1 = f1.result()

    print('killing process {}'.format(pid1))
    os.kill(pid1, signal.SIGHUP)

    print('submitting another task')
    f2 = ex.submit(os.getpid)
    try:
        pid2 = f2.result()
    except futures.process.BrokenProcessPool as e:
        print('could not start new tasks: {}'.format(e))
I haven't seen it in real life, but from the code it looks like the list of ready file descriptors does not contain the result_queue file descriptor.
from concurrent.futures.process:
reader = result_queue._reader

while True:
    _add_call_item_to_queue(pending_work_items,
                            work_ids_queue,
                            call_queue)

    sentinels = [p.sentinel for p in processes.values()]
    assert sentinels
    ready = wait([reader] + sentinels)

    if reader in ready:  # <===================================== THIS
        result_item = reader.recv()
    else:
        # Mark the process pool broken so that submits fail right now.
        executor = executor_reference()
        if executor is not None:
            executor._broken = True
            executor._shutdown_thread = True
            executor = None
        # All futures in flight must be marked failed
        for work_id, work_item in pending_work_items.items():
            work_item.future.set_exception(
                BrokenProcessPool(
                    "A process in the process pool was "
                    "terminated abruptly while the future was "
                    "running or pending."
                ))
            # Delete references to object. See issue16284
            del work_item
The wait function is system-dependent, but assuming a Linux OS (from multiprocessing.connection, with all timeout-related code removed):
def wait(object_list, timeout=None):
    '''
    Wait till an object in object_list is ready/readable.
    Returns list of those objects in object_list which are ready/readable.
    '''
    with _WaitSelector() as selector:
        for obj in object_list:
            selector.register(obj, selectors.EVENT_READ)

        while True:
            ready = selector.select(timeout)
            if ready:
                return [key.fileobj for (key, events) in ready]
            else:
                # some timeout code
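As a rough illustration of the difference (my own sketch, assuming a Unix-like OS and Python 3, not from the quoted documentation): an ordinary exception raised inside a submitted task is simply returned on that task's future and the pool keeps scheduling work, whereas a worker process that dies abruptly marks the whole pool broken.
from concurrent import futures
import os
import signal

def boom():
    raise ValueError("task-level failure")

if __name__ == '__main__':
    with futures.ProcessPoolExecutor(max_workers=2) as ex:
        # A Python exception inside a task does NOT break the pool.
        try:
            ex.submit(boom).result()
        except ValueError as e:
            print('task failed but pool still works:', e)
        print('next result:', ex.submit(os.getpid).result())

        # Killing the worker process itself is what counts as "unexpected".
        pid = ex.submit(os.getpid).result()
        os.kill(pid, signal.SIGKILL)
        try:
            ex.submit(os.getpid).result()
        except futures.process.BrokenProcessPool:
            print('pool is broken once a worker dies abruptly')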

Multiple stdout w/ flush going on in Python threading

I have a small piece of code that I made to test out and hopefully debug the problem without having to modify the code in my main applet in Python. This has led me to build this code:
#!/usr/bin/env python
import sys, threading, time

def loop1():
    count = 0
    while True:
        sys.stdout.write('\r thread 1: ' + str(count))
        sys.stdout.flush()
        count = count + 1
        time.sleep(.3)
        pass
    pass

def loop2():
    count = 0
    print ""
    while True:
        sys.stdout.write('\r thread 2: ' + str(count))
        sys.stdout.flush()
        count = count + 2
        time.sleep(.3)
        pass

if __name__ == '__main__':
    try:
        th = threading.Thread(target=loop1)
        th.start()
        th1 = threading.Thread(target=loop2)
        th1.start()
        pass
    except KeyboardInterrupt:
        print ""
        pass
    pass
My goal with this code is to have both of these threads display their output on stdout (with flushing) at the same time, side by side or something like that. The problem is that, since each one flushes, I assume it overwrites the other string by default. I don't quite know how to get this to work, if it is even possible.
If you just run one of the threads, it works fine. However, I want to be able to run both threads, each with its own string, at the same time in the terminal output. Here is a picture of what I'm getting:
terminal screenshot
let me know if you need more info. thanks in advance.
Instead of allowing each thread to output to stdout, a better solution is to have one thread control stdout exclusively. Then provide a threadsafe channel for the other threads to dispatch data to be output.
One good method to achieve this is to share a Queue between all threads. Ensure that only the output thread is accessing data after it has been added to the queue.
The output thread can store the last message from each other thread and use that data to format stdout nicely. This can include clearing the output to display something like the following, updating it as each thread generates new data.
Threads
#1: 0
#2: 0
Example
Some decisions were made to simplify this example:
There are gotchas to be wary of when passing arguments to threads.
Daemon threads terminate themselves when the main thread exits. They are used here to avoid adding complexity to this answer; using them in long-running or large applications can pose problems. Other questions discuss how to exit a multithreaded application without leaking memory or locking system resources. You will need to think about how your program should signal an exit. Consider using asyncio to save yourself these considerations (a rough sketch appears after the example output below).
No newlines are used because \r carriage returns cannot clear the whole console; they only allow the current line to be rewritten.
import queue, threading
import time, sys

q = queue.Queue()
keepRunning = True

def loop_output():
    thread_outputs = dict()
    while keepRunning:
        try:
            thread_id, data = q.get_nowait()
            thread_outputs[thread_id] = data
        except queue.Empty:
            # because the queue is used to update, there's no need to wait or block.
            pass
        pretty_output = ""
        for thread_id, data in thread_outputs.items():
            pretty_output += '({}:{}) '.format(thread_id, str(data))
        sys.stdout.write('\r' + pretty_output)
        sys.stdout.flush()
        time.sleep(1)

def loop_count(thread_id, increment):
    count = 0
    while keepRunning:
        msg = (thread_id, count)
        try:
            q.put_nowait(msg)
        except queue.Full:
            pass
        count = count + increment
        time.sleep(.3)
        pass
    pass

if __name__ == '__main__':
    try:
        th_out = threading.Thread(target=loop_output)
        th_out.start()
        # make sure to use args, not pass arguments directly
        th0 = threading.Thread(target=loop_count, args=("Thread0", 1))
        th0.daemon = True
        th0.start()
        th1 = threading.Thread(target=loop_count, args=("Thread1", 3))
        th1.daemon = True
        th1.start()
        # Keep the main thread alive to wait for KeyboardInterrupt
        while True:
            time.sleep(.1)
    except KeyboardInterrupt:
        print("Ended by keyboard stroke")
        keepRunning = False
        for th in [th0, th1]:
            th.join()
Example Output:
(Thread0:110) (Thread1:330)
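Since the answer above mentions asyncio as a way to sidestep the shutdown concerns, here is a rough, unofficial sketch of the same idea with coroutines (assuming Python 3.7+). Only the printer task ever touches stdout; the counters just update a shared dict.
import asyncio
import sys

async def counter(name, increment, outputs):
    count = 0
    while True:
        outputs[name] = count        # publish the latest value
        count += increment
        await asyncio.sleep(0.3)

async def printer(outputs):
    while True:
        line = ' '.join('({}:{})'.format(k, v) for k, v in outputs.items())
        sys.stdout.write('\r' + line)
        sys.stdout.flush()
        await asyncio.sleep(1)

async def main():
    outputs = {}
    await asyncio.gather(
        counter("Task0", 1, outputs),
        counter("Task1", 3, outputs),
        printer(outputs),
    )

if __name__ == '__main__':
    try:
        asyncio.run(main())
    except KeyboardInterrupt:
        print("\nEnded by keyboard stroke")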

Tracking application launched with a python file

I've encountered a situation where I thought it would be a good idea to create a launcher for an application that I tend to run several instances of. This is to ensure that I and the application get access to the desired environment variables that can be provided and set for each instance.
import os
import subprocess

def launch():
    """
    Launches application.
    """
    # create environment
    os.environ['APPLICATION_ENVIRON'] = 'usr/path'

    # launch application
    application_path = 'path/to/application'
    app = subprocess.Popen([application_path])
    pid = app.pid
    app.wait()

    print 'done with process: {}'.format(pid)

if __name__ == '__main__':
    launch()
I want to be able to track the applications. Do I dump the PIDs in a file and remove them when the process closes? Do I launch a service that I communicate with?
Being fairly new to programming in general, I don't know if I'm missing a term in the lingo or just thinking about it wrong. I was reading up on daemons and services to track the applications but couldn't come up with a proper answer. Put simply, I'm a bit lost on how to approach it.
What you're doing already seems reasonable. I'd probably extend it to something like this:
import os
import subprocess

def launch_app():
    os.environ['APPLICATION_ENVIRON'] = 'usr/path'
    application_path = 'path/to/application'
    return subprocess.Popen([application_path])

def _purge_finished_apps(apps):
    still_running = set()
    for app in apps:
        return_code = app.poll()
        if return_code is not None:
            print " PID {} no longer running (return code {})".format(app.pid, return_code)
        else:
            still_running.add(app)
    return still_running

def ui():
    apps = set()
    while True:
        print
        print "1. To launch new instance"
        print "2. To view all instances"
        print "3. To exit, terminating all running instances"
        print "4. To exit, leaving instances running"
        opt = int(raw_input())
        apps = _purge_finished_apps(apps)

        if opt == 1:
            app = launch_app()
            apps.add(app)
            print " PID {} launched".format(app.pid)
        elif opt == 2:
            if not apps:
                print "There are no instances running"
            for app in apps:
                print " PID {} running".format(app.pid)
        elif opt == 3:
            for app in apps:
                print "Terminating PID {}".format(app.pid)
                app.terminate()
            for app in apps:
                app.wait()
                print "PID {} finished".format(app.pid)
            return
        elif opt == 4:
            return

if __name__ == "__main__":
    ui()
Here's a code sample to help illustrate how it might work for you.
Note that you can capture the stdout from the processes in real time in your host script; this might be useful if the program you're running uses the console.
(As a side note on the example: You probably would want to change the IP addresses: these are from my internal network. Be kind to any external sites you might want to use, please. Launching thousands of processes with the same target might be construed as a hostile gesture.)
(An additional side note on this example: It is conceivable that I will lose some of my time samples when evaluating the output pipe...if the subprocess writes it to the console piecemeal, it is conceivable that I might occasionally catch it exactly as it is partway done - meaning I might get half of the "time=xxms" statement, causing the RE to miss it. I've done a poor job of checking for this possibility (i.e. I couldn't be bothered for the example). This is one of the hazards of multiprocess/multithreaded programming that you'll need to be aware of if you do it much.)
# Subprocessor.py
#
# Launch a console application repeatedly and test its state.
#
import subprocess
import re

NUMBER_OF_PROCESSES_TO_OPEN = 3
DELAY_BETWEEN_CHECKS = 5

CMD = "ping"
ARGS = ([CMD, "-n", "8", "192.168.0.60"], [CMD, "-n", "12", "192.168.0.20"], [CMD, "-n", "4", "192.168.0.21"])

def go():
    processes = {}
    stopped = [False, False, False]
    samples = [0]*NUMBER_OF_PROCESSES_TO_OPEN
    times = [0.0]*NUMBER_OF_PROCESSES_TO_OPEN

    print "Opening processes..."
    for i in range(NUMBER_OF_PROCESSES_TO_OPEN):
        # The next line creates a subprocess, this is a non-blocking call so
        # the program will complete it more or less instantly.
        newprocess = subprocess.Popen(args = ARGS[i], stdout = subprocess.PIPE)
        processes[i] = newprocess
        print " process {} open, pid == {}.".format(i, processes[i].pid)

    # Build a regular expression to work with the stdout.
    gettimere = re.compile("time=([0-9]*)ms")

    while len(processes) > 0:
        for i, p in processes.iteritems():
            # Popen.poll() asks the process if it is still running - it is
            # a non-blocking call that completes instantly.
            isrunning = (p.poll() == None)

            data = p.stdout.readline()  # Get the stdout from the process.
            matchobj = gettimere.search(data)
            if matchobj:
                for time in matchobj.groups():
                    samples[i] += 1
                    times[i] = (times[i] * (samples[i] - 1) + int(time)) / samples[i]

            # If the process was stopped before we read the last of the
            # data from its output pipe, flag it so we don't keep messing
            # with it.
            if not isrunning:
                stopped[i] = True
                print "Process {} stopped, pid == {}, average time == {}".format(i, processes[i].pid, times[i])

        # This code segment deletes the stopped processes from the dict
        # so we don't keep checking them (and know when to stop the main
        # program loop).
        for i in range(len(stopped)):
            if stopped[i] and processes.has_key(i):
                del processes[i]

if __name__ == '__main__':
    go()

Python threads hang and don't close

This is my first try with threads in Python. I wrote the following program as a very simple example. It just gets a list and prints it using some threads. However, whenever there is an error, the program just hangs in Ubuntu, and I can't seem to do anything to get the prompt back, so I have to start another SSH session to get back in.
I also have no idea what the issue with my program is.
Is there some kind of error handling I can put in to ensure it doesn't hang?
Also, any idea why Ctrl+C doesn't work? (I don't have a break key.)
from Queue import Queue
from threading import Thread
import HAInstances
import logging

log = logging.getLogger()
logging.basicConfig()

class GetHAInstances:
    def oraHAInstanceData(self):
        log.info('Getting HA instance routing data')
        # HAData = SolrGetHAInstances.TalkToOracle.main()
        HAData = HAInstances.main()
        log.info('Query fetched ' + str(len(HAData)) + ' HA Instances to query')
        # for row in HAData:
        #     print row
        return(HAData)

def do_stuff(q):
    while True:
        print q.get()
        print threading.current_thread().name
        q.task_done()

oraHAInstances = GetHAInstances()
mainHAData = oraHAInstances.oraHAInstanceData()

q = Queue(maxsize=0)
num_threads = 10

for i in range(num_threads):
    worker = Thread(target=do_stuff, args=(q,))
    worker.setDaemon(True)
    worker.start()

for row in mainHAData:
    #print str(row[0]) + ':' + str(row[1]) + ':' + str(row[2]) + ':' + str(row[3])
    q.put((row[0],row[1],row[2],row[3]))

q.join()
In your thread method, it is recommended to use "try ... except ... finally". This structure guarantees that q.task_done() is called even when errors occur, so control returns to the main thread instead of hanging on q.join().
def do_stuff(q):
    while True:
        try:
            # do your work here
            pass
        except Exception:
            # log the error
            pass
        finally:
            q.task_done()
Also, in case you want to kill your program, find the PID of your main thread and use kill <pid> to kill it. In Ubuntu or Mint, run ps -Ao pid,cmd; in the output you can find the PID (first column) by searching for the command (second column) you typed to run your Python script.
Your q is hanging because your worker has errored, so your q.task_done() never got called. You also need to import threading in order to use print threading.current_thread().name.

python watchdog for threads

I'm writing a simple app which reads (about a million) lines from a file, copies those lines into a list, and, if the next line is different from the previous one, starts a thread to do some job with that list. The thread job is based on TCP sockets, sending and receiving commands via the telnet lib.
Sometimes my application hangs and does nothing. I wrapped all telnet operations in try-except statements, and socket reads and writes have timeouts.
I thought about writing a watchdog which would do sys.exit() or something similar on that hang condition. But for now I'm still thinking about how to create it and have no idea how to do it, so if you can point me in the right direction, that would be great.
For that file I'm creating 40 threads. Pseudo code looks like:
lock = threading.Lock()
no_of_jobs = 0

class DoJob(threading.Thread):
    def start(self, cond, work):
        self.work = work
        threading.Thread.start(self)

    def run(self)
        global lock
        global no_of_jobs
        lock.acquire()
        no_of_jobs += 1
        lock.release()
        # do some job, if error or if finished, decrement no_of_jobs under lock
        (...)

main:
    # starting conditions:
    with open(sys.argv[1]) as targetsfile:
        head = [targetsfile.next() for x in xrange(1)]
    s = head[0]
    prev_cond = s[0]
    work = []

    for line in open(sys.argv[1], "r"):
        cond = line([0])
        if prev_cond != cond:
            while(no_of_jobs >= MAX_THREADS):
                time.sleep(1)
            DoJob(cond, work)
            prev_cond = cond
            work = None
            work = []
        work.append(line)

    # last job:
    DoJob(cond, work)

    while threading.activeCount() > 1:
        time.sleep(1)
best regards
J
I have successfully used code like below in the past (from a python 3 program I wrote):
import sys
import threading

def die():
    print('ran for too long. quitting.')
    for thread in threading.enumerate():
        if thread.isAlive():
            try:
                thread._stop()
            except:
                pass
    sys.exit(1)

if __name__ == '__main__':
    # bunch of app-specific code...

    # setup max runtime
    die = threading.Timer(2.0, die)  # quit after 2 seconds
    die.daemon = True
    die.start()

    # after work is done
    die.cancel()
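As a side note (my own sketch, not part of the original answer): Thread._stop() is a private CPython detail and does not reliably stop a thread, so another option is a watchdog that joins the workers against a deadline and, if any are still alive, tears the whole process down with os._exit(), which takes any remaining daemon threads with it. MAX_RUNTIME is an assumed tuning knob.
import os
import threading
import time

MAX_RUNTIME = 60  # seconds; adjust to the expected job length

def run_with_watchdog(workers):
    # join each worker, but never wait past the overall deadline
    deadline = time.time() + MAX_RUNTIME
    for w in workers:
        w.join(timeout=max(0, deadline - time.time()))
    if any(w.is_alive() for w in workers):
        print('watchdog: workers still alive after {}s, aborting'.format(MAX_RUNTIME))
        os._exit(1)  # immediate exit, no cleanup; all threads die with the process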
