I've run into a situation where I thought it would be a good idea to create a launcher for an application that I tend to run several instances of. The launcher is there to make sure that I and the application get the environment variables we want, provided and set separately for each instance.
import os
import subprocess
def launch():
"""
Launches application.
"""
# create environment
os.environ['APPLICATION_ENVIRON'] = 'usr/path'
# launch application
application_path = 'path/to/application'
app = subprocess.Popen([application_path])
pid = app.pid
app.wait()
print 'done with process: {}'.format(pid)
if __name__ == '__main__':
launch()
I want to be able to track the applications. Do I dump the PIDs in a file and remove them when the process closes? Do I launch a service that I communicate with?
Being fairly new to programming in general, I don't know if I'm missing a term in the lingo or just thinking about it wrong. I was reading up on daemons and services to track the applications and couldn't come up with a proper answer. Put simply, I'm a bit lost about how to approach this.
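For what it's worth, the "dump the PIDs in a file" idea from the question could look roughly like the sketch below. The directory and helper names are made up for illustration, and it passes a per-instance environment through Popen's env argument rather than mutating os.environ:
import os
import subprocess

PID_DIR = '/tmp/my_launcher_pids'  # hypothetical location for the PID files

def launch_tracked(application_path, env_name, env_value):
    """Launch one instance and record its PID in a file for as long as it runs."""
    env = os.environ.copy()
    env[env_name] = env_value  # per-instance environment, visible only to this child
    app = subprocess.Popen([application_path], env=env)
    if not os.path.isdir(PID_DIR):
        os.makedirs(PID_DIR)
    pid_file = os.path.join(PID_DIR, str(app.pid))
    with open(pid_file, 'w') as f:
        f.write(application_path)
    try:
        app.wait()
    finally:
        # Clean up the marker once this instance has exited.
        if os.path.exists(pid_file):
            os.remove(pid_file)
    return app.returncode
The answers below take a different route and simply keep the Popen objects in a set inside one controlling script, which avoids stale PID files if the launcher itself dies.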
What you're doing already seems reasonable. I'd probably extend it to something like this:
import os
import subprocess
def launch_app():
os.environ['APPLICATION_ENVIRON'] = 'usr/path'
application_path = 'path/to/application'
return subprocess.Popen([application_path])
def _purge_finished_apps(apps):
still_running = set()
for app in apps:
return_code = app.poll()
if return_code is not None:
print " PID {} no longer running (return code {})".format(app.pid, return_code)
else:
still_running.add(app)
return still_running
def ui():
apps = set()
while True:
print
print "1. To launch new instance"
print "2. To view all instances"
print "3. To exit, terminating all running instances"
print "4. To exit, leaving instances running"
opt = int(raw_input())
apps = _purge_finished_apps(apps)
if opt == 1:
app = launch_app()
apps.add(app)
print " PID {} launched".format(app.pid)
elif opt == 2:
if not apps:
print "There are no instances running"
for app in apps:
print " PID {} running".format(app.pid)
elif opt == 3:
for app in apps:
print "Terminating PID {}".format(app.pid)
app.terminate()
for app in apps:
app.wait()
print "PID {} finished".format(app.pid)
return
elif opt == 4:
return
if __name__ == "__main__":
ui()
Here's a code sample to help illustrate how it might work for you.
Note that you can capture the stdout from the processes in real time in your host script; this might be useful if the program you're running uses the console.
(As a side note on the example: you would probably want to change the IP addresses; these are from my internal network. Be kind to any external sites you might want to use, please. Launching thousands of processes with the same target might be construed as a hostile gesture.)
(An additional side note on this example: it is conceivable that I will lose some of my time samples when evaluating the output pipe. If the subprocess writes to the console piecemeal, I might occasionally catch a line while it is only partway written, meaning I get half of the "time=xxms" statement and the regular expression misses it. I've done a poor job of checking for this possibility (i.e. I couldn't be bothered for the example). This is one of the hazards of multiprocess/multithreaded programming that you'll need to be aware of if you do it much.)
# Subprocessor.py
#
# Launch a console application repeatedly and test its state.
#
import subprocess
import re
NUMBER_OF_PROCESSES_TO_OPEN = 3
DELAY_BETWEEN_CHECKS = 5
CMD = "ping"
ARGS = ([CMD, "-n", "8", "192.168.0.60"], [CMD, "-n", "12", "192.168.0.20"], [CMD, "-n", "4", "192.168.0.21"])
def go():
processes = {}
stopped = [False, False, False]
samples = [0]*NUMBER_OF_PROCESSES_TO_OPEN
times = [0.0]*NUMBER_OF_PROCESSES_TO_OPEN
print "Opening processes..."
for i in range(NUMBER_OF_PROCESSES_TO_OPEN):
# The next line creates a subprocess, this is a non-blocking call so
# the program will complete it more or less instantly.
newprocess = subprocess.Popen(args = ARGS[i], stdout = subprocess.PIPE)
processes[i] = newprocess
print " process {} open, pid == {}.".format(i, processes[i].pid)
# Build a regular expression to work with the stdout.
gettimere = re.compile("time=([0-9]*)ms")
while len(processes) > 0:
for i, p in processes.iteritems():
# Popen.poll() asks the process if it is still running - it is
# a non-blocking call that completes instantly.
            isrunning = (p.poll() is None)
data = p.stdout.readline() # Get the stdout from the process.
matchobj = gettimere.search(data)
if matchobj:
for time in matchobj.groups():
samples[i] += 1
times[i] = (times[i] * (samples[i] - 1) + int(time)) / samples[i]
# If the process was stopped before we read the last of the
# data from its output pipe, flag it so we don't keep messing
# with it.
if not isrunning:
stopped[i] = True
print "Process {} stopped, pid == {}, average time == {}".format(i, processes[i].pid, times[i])
# This code segment deletes the stopped processes from the dict
# so we don't keep checking them (and know when to stop the main
# program loop).
for i in range(len(stopped)):
if stopped[i] and processes.has_key(i):
del processes[i]
if __name__ == '__main__':
go()
I am currently working on a Python 3 script that is designed to stress test an ARP service. Currently, I am creating up to 255 "worker" threads and then having them send up to 2^15 packets to stress test the server. This is the main test script that I run:
if __name__ == '__main__':
for i in range(0, 8):
for j in range(0, 15):
print("Multithreading test with", 2**i, "workers and ", 2**j,
"packets:")
sleep(3)
try:
arp_load_test.__main__(2**i)
except:
print("Multithreading test failed at", 2**i, "workers and ",
2**j, "packets:")
break
print("Moving on to the multiprocessing test.")
sleep(10)
for i in range(0, 15):
print("Multiprocessing test with", 2**i, "workers:")
sleep(3)
try:
arp_load_test2.__main__(2**i)
except:
print("Multiprocessing test failed at", 2**i, "workers.")
print("\n\n\t\t\t\tDONE!")
break
The first code block tests the multithreading and the second does the same thing with multiprocessing: arp_load_test.py is the multithreading version, and arp_load_test2.py is its multiprocessing counterpart. In the except part of each of the for loops I want to end the loop as soon as one of the threads fails. How do I do that? Here's the code for arp_load_test.py; arp_load_test2.py is almost exactly the same:
def __main__(threaders = 10, items = 10):
print("\tStarting threading main method\n")
sleep(1)
a = datetime.datetime.now()
# create a logger object
logging.basicConfig(filename = "arp_multithreading_log.txt",
format = "%(asctime)s %(message)s",
level = logging.INFO)
# default values
interface = "enp0s8"
workers = threaders
packets = items
dstip = "192.168.30.1"
# parse the command line
try:
opts, args = getopt.getopt(sys.argv[1:], "i:w:n:d:", ["interface=",
"workers=",
"packets=",
"dstip="])
except getopt.GetoptError as err:
print("Error: ", str(err), file = sys.stderr)
sys.exit(-1)
# override defaults with the options passed in on the command line
for o, a in opts:
if o in ("-i", "--interface"):
interface = a
elif o in ("-w", "--workers"):
w = int(a)
if w > 254:
workers = 254
print("Max worker threads is 254. Using 254 workers",
file = sys.stderr)
elif w < 1:
workers = 1
print("Min worker threads is 1. Using 1 worker",
file = sys.stderr)
else:
workers = w
elif o in ("-n", "--packets"):
packets = int(a)
elif o in ("-d", "--dstip"):
dstip = a
else:
assert False, "unhandled option"
# create an empty list as a thread pool
office = []
# give all the workers jobs in the office
for i in range(workers):
office.append(ArpTestThread(i, "ARP-" + str(i), i, interface, packets,
dstip))
# do all the work
logging.info("BEGIN ARP FLOOD TEST")
for worker in office:
worker.daemon = True
worker.start()
for worker in office:
worker.join()
b = datetime.datetime.now()
print("\tSent", len(office) * packets, "packets!\n")
print("It took", a - b, "seconds!")
logging.info("END ARP FLOOD TEST\n")
sleep(5)
##### end __main__
ArpTestThread is a subclass of threading.Thread (or Process) that is set up to send the packets to the ARP service. Also, I'm running the test script from inside a VM via the terminal, but I am not using any of the command-line options the program is set up to use; I just added parameters instead because I was lazy.
Do I need to place a try block inside the class file instead of the test script? I was given 90% of the class file code already complete, and am updating it and trying to collect data on what it does, along with optimizing it to properly stress the ARP service. I want the for loops in the test script (the very first portion of code in this post) to break, stop all currently running threads, and print out at what point the program failed, as soon as one of the threads/processes crashes. Is that possible?
EDIT:
The suggested duplicate question does not solve my problem. I am trying to send packets, and it does not raise an exception until the program essentially runs out of memory to continue sending packets to the ARP service. I don’t get an exception until the program itself breaks, so the possible solution that suggested using a simple signal does not work.
The program can finish successfully. Threads/processes can (and should) get started, send a packet, and then close up. If something happens in any single thread/process, I want everything currently running to stop and an error message to be printed to the console.
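No answer is recorded here, but one common pattern (a sketch only, not based on the real ArpTestThread code) is to have each worker record any exception it hits on a shared queue, then have the launching code re-raise after joining, so the try/except around arp_load_test.__main__() actually fires and the outer loop breaks:
import queue
import threading

class ReportingThread(threading.Thread):
    """Worker that records its exception instead of dying silently."""
    def __init__(self, target, args, error_queue):
        super().__init__()
        self._target_func = target
        self._args = args
        self._error_queue = error_queue

    def run(self):
        try:
            self._target_func(*self._args)
        except Exception as exc:
            # Remember the failure so the launcher can see it after join().
            self._error_queue.put(exc)

def run_batch(target, work_items):
    errors = queue.Queue()
    workers = [ReportingThread(target, (item,), errors) for item in work_items]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    if not errors.empty():
        # Re-raise the first recorded failure in the caller's thread.
        raise errors.get()
For the multiprocessing variant, each Process already exposes an exitcode after join(), so checking for a non-zero exitcode and raising in the parent achieves the same effect.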
I'm taking my first foray into the Python multiprocessing module and I'm running into some problems. I'm very familiar with the threading module, but I need to make sure the processes I'm executing are running in parallel.
Here's an outline of what I'm trying to do. Please ignore things like undeclared variables/functions because I can't paste my code in full.
import multiprocessing
import time
def wrap_func_to_run(host, args, output):
output.append(do_something(host, args))
return
def func_to_run(host, args):
return do_something(host, args)
def do_work(server, client, server_args, client_args):
server_output = func_to_run(server, server_args)
client_output = func_to_run(client, client_args)
#handle this output and return a result
return result
def run_server_client(server, client, server_args, client_args, server_output, client_output):
server_process = multiprocessing.Process(target=wrap_func_to_run, args=(server, server_args, server_output))
server_process.start()
client_process = multiprocessing.Process(target=wrap_func_to_run, args=(client, client_args, client_output))
client_process.start()
server_process.join()
client_process.join()
#handle the output and return some result
def run_in_parallel(server, client):
#set up commands for first process
server_output = client_output = []
server_cmd = "cmd"
client_cmd = "cmd"
process_one = multiprocessing.Process(target=run_server_client, args=(server, client, server_cmd, client_cmd, server_output, client_output))
process_one.start()
#set up second process to run - but this one can run here
result = do_work(server, client, "some server args", "some client args")
process_one.join()
#use outputs above and the result to determine result
return final_result
def main():
#grab client
client = client()
#grab server
server = server()
return run_in_parallel(server, client)
if __name__ == "__main__":
main()
Here's the error I'm getting:
Error in sys.exitfunc:
Traceback (most recent call last):
File "/usr/lib64/python2.7/atexit.py", line 24, in _run_exitfuncs
func(*targs, **kargs)
File "/usr/lib64/python2.7/multiprocessing/util.py", line 319, in _exit_function
p.join()
File "/usr/lib64/python2.7/multiprocessing/process.py", line 143, in join
assert self._parent_pid == os.getpid(), 'can only join a child process'
AssertionError: can only join a child process
I've tried a lot of different things to fix this but my feeling is that there's something wrong with the way I'm using this module.
EDIT:
So I created a file that reproduces this by simulating the client/server and the work they do. I also missed an important point, which was that I was running this on Unix. Another important bit of information is that do_work in my actual case involves using os.fork(). I was unable to reproduce the error without also using os.fork(), so I'm assuming the problem is there. In my real-world case, that part of the code was not mine, so I was treating it like a black box (likely a mistake on my part). Anyway, here's the code to reproduce it:
#!/usr/bin/python
import multiprocessing
import time
import os
import signal
import sys
class Host():
def __init__(self):
self.name = "host"
def work(self):
#override - use to simulate work
pass
class Server(Host):
def __init__(self):
self.name = "server"
def work(self):
x = 0
for i in range(10000):
x+=1
print x
time.sleep(1)
class Client(Host):
def __init__(self):
self.name = "client"
def work(self):
x = 0
for i in range(5000):
x+=1
print x
time.sleep(1)
def func_to_run(host, args):
print host.name + " is working"
host.work()
print host.name + ": " + args
return "done"
def do_work(server, client, server_args, client_args):
print "in do_work"
server_output = client_output = ""
child_pid = os.fork()
if child_pid == 0:
server_output = func_to_run(server, server_args)
sys.exit(server_output)
time.sleep(1)
client_output = func_to_run(client, client_args)
# kill and wait for server to finish
os.kill(child_pid, signal.SIGTERM)
(pid, status) = os.waitpid(child_pid, 0)
return (server_output == "done" and client_output =="done")
def run_server_client(server, client, server_args, client_args):
server_process = multiprocessing.Process(target=func_to_run, args=(server, server_args))
print "Starting server process"
server_process.start()
client_process = multiprocessing.Process(target=func_to_run, args=(client, client_args))
print "Starting client process"
client_process.start()
print "joining processes"
server_process.join()
client_process.join()
print "processes joined and done"
def run_in_parallel(server, client):
#set up commands for first process
server_cmd = "server command for run_server_client"
client_cmd = "client command for run_server_client"
process_one = multiprocessing.Process(target=run_server_client, args=(server, client, server_cmd, client_cmd))
print "Starting process one"
process_one.start()
#set up second process to run - but this one can run here
print "About to do work"
result = do_work(server, client, "server args from do work", "client args from do work")
print "Joining process one"
process_one.join()
#use outputs above and the result to determine result
print "Process one has joined"
return result
def main():
#grab client
client = Client()
#grab server
server = Server()
return run_in_parallel(server, client)
if __name__ == "__main__":
main()
If I remove the use of os.fork() in do_work I don't get the error and the code behaves like I would have expected it before (except for the passing of outputs which I've accepted as my mistake/misunderstanding). I can change the old code to not use os.fork() but I'd also like to know why this caused this problem and if there's a workable solution.
EDIT 2:
I had started working on a solution that omits os.fork() before the accepted answer. Here's what I have, with some tweaking of the amount of simulated work that can be done:
#!/usr/bin/python
import multiprocessing
import time
import os
import signal
import sys
from Queue import Empty
class Host():
def __init__(self):
self.name = "host"
def work(self, w):
#override - use to simulate work
pass
class Server(Host):
def __init__(self):
self.name = "server"
def work(self, w):
x = 0
for i in range(w):
x+=1
print x
time.sleep(1)
class Client(Host):
def __init__(self):
self.name = "client"
def work(self, w):
x = 0
for i in range(w):
x+=1
print x
time.sleep(1)
def func_to_run(host, args, w, q):
print host.name + " is working"
host.work(w)
print host.name + ": " + args
q.put("ZERO")
return "done"
def handle_queue(queue):
done = False
results = []
return_val = 0
while not done:
#try to grab item from Queue
tr = None
try:
tr = queue.get_nowait()
print "found element in queue"
print tr
except Empty:
done = True
if tr is not None:
results.append(tr)
for el in results:
if el != "ZERO":
return_val = 1
return return_val
def do_work(server, client, server_args, client_args):
print "in do_work"
server_output = client_output = ""
child_pid = os.fork()
if child_pid == 0:
server_output = func_to_run(server, server_args)
sys.exit(server_output)
time.sleep(1)
client_output = func_to_run(client, client_args)
# kill and wait for server to finish
os.kill(child_pid, signal.SIGTERM)
(pid, status) = os.waitpid(child_pid, 0)
return (server_output == "done" and client_output =="done")
def run_server_client(server, client, server_args, client_args, w, mq):
local_queue = multiprocessing.Queue()
server_process = multiprocessing.Process(target=func_to_run, args=(server, server_args, w, local_queue))
print "Starting server process"
server_process.start()
client_process = multiprocessing.Process(target=func_to_run, args=(client, client_args, w, local_queue))
print "Starting client process"
client_process.start()
print "joining processes"
server_process.join()
client_process.join()
print "processes joined and done"
if handle_queue(local_queue) == 0:
mq.put("ZERO")
def run_in_parallel(server, client):
#set up commands for first process
master_queue = multiprocessing.Queue()
server_cmd = "server command for run_server_client"
client_cmd = "client command for run_server_client"
process_one = multiprocessing.Process(target=run_server_client, args=(server, client, server_cmd, client_cmd, 400000000, master_queue))
print "Starting process one"
process_one.start()
#set up second process to run - but this one can run here
print "About to do work"
#result = do_work(server, client, "server args from do work", "client args from do work")
run_server_client(server, client, "server args from do work", "client args from do work", 5000, master_queue)
print "Joining process one"
process_one.join()
#use outputs above and the result to determine result
print "Process one has joined"
return_val = handle_queue(master_queue)
print return_val
return return_val
def main():
#grab client
client = Client()
#grab server
server = Server()
val = run_in_parallel(server, client)
if val:
print "failed"
else:
print "passed"
return val
if __name__ == "__main__":
main()
This code has some tweaked printouts just to see exactly what is happening. I used a multiprocessing.Queue to store and share the outputs across the processes and back to my main thread to be handled. I think this solves the Python portion of my problem, but there are still some issues in the code I'm working on. The only other thing I can say is that the equivalent of func_to_run involves sending a command over ssh and grabbing any stderr along with the output. For some reason, this works perfectly fine for a command that has a short execution time, but not for a command that has a much larger execution time/output. I tried simulating this with the drastically different work values in my code here but haven't been able to reproduce similar results.
EDIT 3
Library code I'm using (again not mine) uses Popen.wait() for the ssh commands and I just read this:
Popen.wait()
Wait for child process to terminate. Set and return returncode attribute.
Warning: This will deadlock when using stdout=PIPE and/or stderr=PIPE and the child process generates enough output to a pipe such that it blocks waiting for the OS pipe buffer to accept more data. Use communicate() to avoid that.
I adjusted the code to not buffer and just print as it is received and everything works.
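For anyone who hits the same deadlock, the fix boils down to draining the pipe instead of calling wait() while output is still queued. A minimal sketch (the command is only an example):
import subprocess

def run_and_stream(command):
    """Run a command, echo its output as it arrives, and return its exit code."""
    proc = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    # Reading keeps the pipe from filling up, so the child never blocks on write.
    for line in iter(proc.stdout.readline, b''):
        print(line.rstrip())  # on Python 3 the line is bytes; decode() it if needed
    proc.stdout.close()
    return proc.wait()

# run_and_stream(["ssh", "somehost", "some_long_command"])
# If you don't need the output incrementally, proc.communicate() also drains
# both pipes safely before the process is reaped.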
I can change the old code to not use os.fork() but I'd also like to know why this caused this problem and if there's a workable solution.
The key to understanding the problem is knowing exactly what fork() does. CPython docs state "Fork a child process." but this presumes you understand the C library call fork().
Here's what glibc's manpage says about it:
fork() creates a new process by duplicating the calling process. The new process, referred to as the child, is an exact duplicate of the calling process, referred to as the parent, except for the following points: ...
It's basically as if you took your program and made a copy of its program state (heap, stack, instruction pointer, etc.) with small differences and let it execute independently of the original. When this child process exits naturally, it will use exit(), and that will trigger the atexit() handlers registered by the multiprocessing module.
What can you do to avoid it?
omit os.fork(): use multiprocessing instead, like you are exploring now
probably effective: import multiprocessing after executing fork(), only in the child or parent as necessary.
use _exit() in the child (the CPython docs state: "The standard way to exit is sys.exit(n). _exit() should normally only be used in the child process after a fork()."); see the sketch below
https://docs.python.org/2/library/os.html#os._exit
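Here is a rough sketch of that third option, with made-up child_job/parent_job placeholders, showing where os._exit() goes so the child copy never runs the parent's atexit handlers (os.fork() is Unix-only, as in the question):
import os

def do_forked_work(child_job, parent_job):
    """Illustration only: fork, do the child's work, leave with os._exit()."""
    child_pid = os.fork()
    if child_pid == 0:
        # Child copy: do the work, then bypass exit()/atexit entirely so the
        # multiprocessing cleanup inherited from the parent never runs here.
        status = 0 if child_job() else 1
        os._exit(status)
    # Parent: carry on with its own work, then reap the child.
    parent_job()
    _, status = os.waitpid(child_pid, 0)
    return status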
In addition to the excellent solution from Cain, if you're facing the same situation as I was, where you can't control how the subprocesses are created, you can try to unregister the atexit function in your subprocesses to get rid of these messages:
import atexit
from multiprocessing.util import _exit_function
atexit.unregister(_exit_function)
ATTENTION: This may lead to leakage. For instance, if your subprocesses have their own children, they won't be cleaned up. So clarify your situation and test thoroughly afterwards.
It seems to me that you are spawning one process too many. I would not spawn a new process from run_in_parallel, but simply call run_server_client with the proper arguments, because it will spawn its own processes inside.
Right, so I have a Python process which is running constantly, maybe even under Supervisor. What is the best way to achieve the following monitoring?
Send an alert and restart if the process has crashed. I'd like to automatically receive a signal every time the process crashes and auto restart it.
Send an alert and restart if the process has gone stale, i.e. hasn't crunched anything for say 1 minute.
Restart on demand
I'd like to achieve all of the above through Python. I know Supervisord will do most of it, but I want to see if it can be done through Python itself.
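Before the answers below (which point at Supervisor events and a cron-plus-Redis setup), here is a rough sketch of what a pure-Python watchdog for the first two points could look like. The command, heartbeat path, and alert() body are placeholders; restart on demand would need a small control channel (a signal handler, a file, an HTTP endpoint) on top of this loop:
import os
import subprocess
import time

CMD = ["python", "worker.py"]        # placeholder: the process to supervise
HEARTBEAT = "/tmp/worker.heartbeat"  # placeholder: worker touches this while healthy
STALE_AFTER = 60                     # seconds without a heartbeat = stale

def alert(message):
    print("ALERT: " + message)       # placeholder: send mail / SMS / HTTP hook here

def watchdog():
    proc = subprocess.Popen(CMD)
    while True:
        time.sleep(5)
        if proc.poll() is not None:  # crashed or exited
            alert("process died with code {0}".format(proc.returncode))
            proc = subprocess.Popen(CMD)
            continue
        try:
            age = time.time() - os.path.getmtime(HEARTBEAT)
        except OSError:
            age = STALE_AFTER + 1    # no heartbeat file yet
        if age > STALE_AFTER:        # gone stale: alert and restart
            alert("no heartbeat for {0:.0f}s, restarting".format(age))
            proc.terminate()
            proc.wait()
            proc = subprocess.Popen(CMD)

if __name__ == "__main__":
    watchdog()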
I think what you are looking for is Supervisor Events: http://supervisord.org/events.html
Also look at Superlance; it's a package of plugin utilities for monitoring and controlling processes that run under supervisor.
https://superlance.readthedocs.org/en/latest/
You can configure things like crash emails, crash SMS, memory consumption alerts, HTTP hooks, etc.
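To make that concrete, an event listener is just a long-running process that speaks a small line protocol on stdin/stdout and is registered in supervisord.conf under an [eventlistener:x] section with, say, events=PROCESS_STATE_EXITED. A rough sketch of such a listener (what you do with the payload is up to you):
import sys

def write_stdout(s):
    # Only protocol traffic may go to stdout; use stderr for your own logging.
    sys.stdout.write(s)
    sys.stdout.flush()

def main():
    while True:
        write_stdout('READY\n')  # tell supervisord we are ready for an event
        header_line = sys.stdin.readline()
        headers = dict(token.split(':') for token in header_line.split())
        payload = sys.stdin.read(int(headers['len']))
        if headers['eventname'] == 'PROCESS_STATE_EXITED':
            # Placeholder: send your crash alert here (mail, SMS, HTTP hook, ...).
            sys.stderr.write('process exited: ' + payload + '\n')
        write_stdout('RESULT 2\nOK')  # acknowledge the event

if __name__ == '__main__':
    main()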
Well, if you want a homegrown solution, this is what I could come up with.
Maintain the process state, both actual and expected, in Redis. You can monitor it however you like by making a web interface that checks the actual state and changes the expected state.
Run the Python script from crontab to check the state and take appropriate action when required. Here I check every 3 seconds and use SES to alert admins via email.
DISCLAIMER: The code has not been run or tested. I just wrote it now, so it is prone to errors.
Open the crontab file:
$ crontab -e
Add this line at the end of it, to make run_monitor.sh run every minute.
#Runs this process every 1 minute.
*/1 * * * * bash ~/path/to/run_monitor.sh
run_monitor.sh runs the Python script in a loop, once every 3 seconds.
This is done because crontab's minimum interval is 1 minute. We want to check the process every 3 seconds, 20 times (3 sec * 20 = 1 minute), so the script runs for about a minute before crontab starts it again.
run_monitor.sh
for count in {1..20}
do
cd '/path/to/check_status'
/usr/local/bin/python check_status.py "myprocessname" "python startcommand.py"
sleep 3 #check every 3 seconds.
done
Here I have assumed:
*state 0 = stop or stopped (expected vs. actual)
*state -1 = restart
*state 1 = run or running
You can add more states as per your convenience; a stale process can also be a state.
I have used the process name to kill, start, or check processes; you can easily modify it to read specific PID files instead (a small sketch of a PID-file check follows the script below).
check_status.py
import sys
import redis
import subprocess
import sys
import boto.ses
def send_mail(recipients, message_subject, message_body):
"""
uses AWS SES to send mail.
"""
    SENDER_MAIL = 'xxx@yyy.com'
AWS_KEY = 'xxxxxxxxxxxxxxxxxxx'
AWS_SECRET = 'xxxxxxxxxxxxxxxxxxx'
AWS_REGION = 'xx-xxxx-x'
mail_conn = boto.ses.connect_to_region(AWS_REGION,
aws_access_key_id=AWS_KEY,
aws_secret_access_key=AWS_SECRET
)
    mail_conn.send_email(SENDER_MAIL, message_subject, message_body, recipients, format='html')
return True
class Shell(object):
'''
    Convenient wrapper over subprocess.
'''
    def __init__(self, command, raise_on_error=True):
        self.command = command
        self.raise_on_error = raise_on_error
        self.output = None
        self.error = None
        self.return_code = None
def run(self):
try:
process = subprocess.Popen(self.command, shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
            self.output, self.error = process.communicate()
            self.return_code = process.returncode
if self.return_code and self.raise_on_error:
print self.error
raise Exception("Error while executing %s::%s"%(self.command, self.error))
except subprocess.CalledProcessError:
print self.error
raise Exception("Error while executing %s::%s"%(self.command, self.error))
redis_client = redis.Redis('xxxredis_hostxxx')
def get_state(process_name, state_type): #state_type will be expected or actual.
    state = redis_client.get('{process_name}_{state_type}_state'.format(process_name=process_name, state_type=state_type)) #value could be 0 or 1
    return state
def set_state(process_name, state_type, state): #state_type will be expected or actual.
    state = redis_client.set('{process_name}_{state_type}_state'.format(process_name=process_name, state_type=state_type), state)
    return state
def get_stale_state(process_name):
    state = redis_client.get('{process_name}_stale_state'.format(process_name=process_name)) #value could be 0 or 1
    return state
def check_running_status(process_name):
command = "ps -ef|grep {process_name}|wc -l".format(process_name=process_name)
shell = Shell(command = command)
shell.run()
if shell.output=='0':
return False
return True
def start_process(start_command): #pass start_command with a '&' so the process starts in the background.
    shell = Shell(command = start_command)
shell.run()
def stop_process(process_name):
command = "ps -ef| grep {process_name}| awk '{print $2}'".format(process_name=process_name)
shell = Shell(command = command, raise_on_error=False)
shell.run()
if not shell.output:
return
process_ids = shell.output.strip().split()
for process_id in process_ids:
command = 'kill {process_id}'.format(process_id=process_id)
shell = Shell(command=command, raise_on_error=False)
        shell.run()
def check_process(process_name, start_command):
expected_state = get_state(process_name, 'expected')
if expected_state == 0: #stop
stop_process(process_name)
set_state(process_name, 'actual', 0)
    elif expected_state == -1: #restart
stop_process(process_name)
set_state(process_name, 'actual', 0)
start_process(start_command)
set_state(process_name, 'actual', 1)
set_state(process_name, 'expected', 1) #set expected back to 1 so we dont keep on restarting.
elif expected_state == 1:
running = check_running_status(process_name)
if not running:
set_state(process_name, 'actual', 0)
            send_mail(recipients=["abc@admin.com", "xyz@admin.com"], message_subject="Alert", message_body="Your process is down. Trying to restart.")
start_process(start_command)
running = check_running_status(process_name)
            if running:
                send_mail(recipients=["abc@admin.com", "xyz@admin.com"], message_subject="Alert", message_body="Your process was restarted.")
                set_state(process_name, 'actual', 1)
            else:
                send_mail(recipients=["abc@admin.com", "xyz@admin.com"], message_subject="Alert", message_body="Your process could not be restarted.")
if __name__ == '__main__':
args = sys.argv[1:]
process_name = args[0]
start_command = args[1]
check_process(process_name, start_command)
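As an aside to the note above about PID files, the same check can be driven from a PID file instead of grepping ps. A minimal sketch (the path is made up):
import os

def is_running(pid_file):
    """PID-file variant of check_running_status(); returns True if the PID is alive."""
    try:
        with open(pid_file) as f:
            pid = int(f.read().strip())
    except (IOError, ValueError):
        return False      # no PID file, or garbage inside it
    try:
        os.kill(pid, 0)   # signal 0 only checks existence; nothing is delivered
    except OSError:
        return False      # no such process (or not ours)
    return True

# e.g. is_running('/var/run/myprocess.pid')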
This is my first try with threads in Python. I wrote the following program as a very simple example: it just gets a list and prints it using some threads. However, whenever there is an error, the program just hangs in Ubuntu, and I can't seem to do anything to get the command prompt back, so I have to start another SSH session to get back in.
I also have no idea what the issue with my program is.
Is there some kind of error handling I can put in to ensure it doesn't hang?
Also, any idea why Ctrl+C doesn't work? (I don't have a Break key.)
from Queue import Queue
from threading import Thread
import HAInstances
import logging
log = logging.getLogger()
logging.basicConfig()
class GetHAInstances:
def oraHAInstanceData(self):
log.info('Getting HA instance routing data')
# HAData = SolrGetHAInstances.TalkToOracle.main()
HAData = HAInstances.main()
log.info('Query fetched ' + str(len(HAData)) + ' HA Instances to query')
# for row in HAData:
# print row
return(HAData)
def do_stuff(q):
while True:
print q.get()
print threading.current_thread().name
q.task_done()
oraHAInstances = GetHAInstances()
mainHAData = oraHAInstances.oraHAInstanceData()
q = Queue(maxsize=0)
num_threads = 10
for i in range(num_threads):
worker = Thread(target=do_stuff, args=(q,))
worker.setDaemon(True)
worker.start()
for row in mainHAData:
#print str(row[0]) + ':' + str(row[1]) + ':' + str(row[2]) + ':' + str(row[3])i
q.put((row[0],row[1],row[2],row[3]))
q.join()
In your thread method, it is recommended to use a "try ... except ... finally" structure. This guarantees that q.task_done() is called even when errors occur, so control returns to the main thread (its q.join() can finish).
def do_stuff(q):
while True:
try:
#do your works
except:
#log the error
finally:
q.task_done()
Also, in case you want to kill your program, find out the PID of your main thread and use kill <pid> to kill it. In Ubuntu or Mint, use ps -Ao pid,cmd; in the output, you can find the PID (first column) by searching for the command (second column) you typed to run your Python script.
Your q is hanging because your worker has errored, so q.task_done() never got called.
You also need to import threading in order to use print threading.current_thread().name.
I've been messing around with a Django project.
What I want to achieve is the Django project starting up in another process while the parent process initiates a load of arbitrary code I have written (the backend of my project). Obviously, the Django process and the parent process communicate. I'd like a dictionary to be read and written to by both processes.
I have the following code, based upon examples from here:
#!/usr/bin/env python
from multiprocessing import Process, Manager
import os
import time
from dj import manage
def django(d, l):
print "starting django"
d[1] = '1'
d['2'] = 2
d[0.25] = None
l.reverse()
manage.start()
def stop(d, l):
print "stopping"
print d
print l
if (__name__ == '__main__'):
os.system('clear')
print "starting backend..."
time.sleep(1)
print "backend start complete."
manager = Manager()
d = manager.dict()
l = manager.list(range(10))
p = Process(target=django, args=(d, l))
p.start()
try:
p.join()
except KeyboardInterrupt:
print "interrupt detected"
stop(d, l)
When I hit CTRL+C to kill the Django process, I'm seeing the Django server shut down, and stop() being called. Then what I want to see is the dictionary, d, and list, l, being printed.
Output is:
starting backend...
backend start complete.
starting django
Validating models...
0 errors found
Django version 1.3, using settings 'dj.settings'
Development server is running at http://127.0.0.1:8000/
Quit the server with CONTROL-C.
^Cinterrupt detected
stopping
<DictProxy object, typeid 'dict' at 0x141ae10; '__str__()' failed>
<ListProxy object, typeid 'list' at 0x1425090; '__str__()' failed>
It can't find the dictionary or list after the CTRL+C event. Has the Manager process been terminated when the SIGINT is issued? If it has, is there any way to stop it from terminating there along with the main process?
I hope this makes sense.
Any help gratefully received.
OK, as far as I can see there is no way to simply ignore the exception. When you raise one, you always go straight into an "except" block if there is one. What I'm proposing here is something that will restart your Django application on each ^C, but note that you should add some back door for leaving.
In theory, you could wrap each line in a try..except block, which would act like a restart of each line and would not be as visible as a restart of the whole script. If anyone finds a solution that really works, I will be the first to upvote it.
You can move everything inside your if (__name__ == '__main__'): into a main function and leave something like this:
def main():
#all the code...
if (__name__ == '__main__'):
while True:
try:
main()
except KeyboardInterrupt:
pass
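Another option worth knowing about, as a sketch only (not tested against the Django setup above): the Manager's helper process dies on ^C because it sits in the same foreground process group and receives the same SIGINT, which is why the DictProxy/ListProxy objects fail afterwards. Starting a SyncManager with an initializer that ignores SIGINT keeps the shared dict and list alive long enough for stop() to print them:
import signal
from multiprocessing.managers import SyncManager

def _ignore_sigint():
    # Runs inside the manager process: leave ^C handling to the parent so the
    # manager survives and its proxies still work after the interrupt.
    signal.signal(signal.SIGINT, signal.SIG_IGN)

manager = SyncManager()
manager.start(_ignore_sigint)   # use this instead of multiprocessing.Manager()
d = manager.dict()
l = manager.list(range(10))
# ... hand d and l to the Process as before; call manager.shutdown() when done.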