Troubleshoot python daemon that quits unexpectedly?


What's the best way to monitor a python daemon to determine the cause of it quitting unexpectedly? Is strace my best option or is there something Python specific that does the job?

I would generally start by adding logging to it. At a minimum, have whatever is launching it capture stdout/stderr so that any stack traces are saved. Examine your except blocks to make sure you're not capturing exceptions silently.
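A minimal sketch of that advice for Python 3, assuming the daemon can call a setup function once at startup. The function name install_crash_logging and the "daemon" logger name are illustrative and not part of the original answer. It writes log records to a file and routes uncaught exceptions through sys.excepthook, so the stack trace reaches the log before the process dies.

```python
import logging
import sys


def install_crash_logging(log_path):
    """Log to log_path and record any uncaught exception with its traceback."""
    logger = logging.getLogger("daemon")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    handler = logging.FileHandler(log_path)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(message)s")
    )
    logger.addHandler(handler)

    def log_uncaught(exc_type, exc_value, exc_tb):
        # exc_info makes the logging module append the formatted traceback.
        logger.critical(
            "uncaught exception", exc_info=(exc_type, exc_value, exc_tb)
        )

    # Called by the interpreter for exceptions that reach the top level
    # of the main thread.
    sys.excepthook = log_uncaught
    return logger
```

This only covers exceptions that propagate out of the main thread. A process that is killed by a signal or that calls os._exit() leaves no traceback, which is where capturing stdout/stderr from the launcher or using strace still helps.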

You can use pdb:
python -m pdb myscript.py
Running your program like this makes it enter post-mortem debugging if it exits with an uncaught exception. If you have an idea where the problem is, you can also put import pdb; pdb.set_trace() at the point where you want to start debugging. Logging profusely also helps.

As the answer above says, add logging. Be careful if you are using the python-daemon module, though: it closes open file descriptors when it daemonizes, so a logging file handler that was opened beforehand stops writing. Either set up logging after daemonizing, keep the handler's stream open through the files_preserve option, or write to the log file manually.
Also, make your daemon restart after a failure by running it inside a loop and catching exceptions inside that loop.
Example:
import time
while True:
    try:
        log_to_file("starting daemon")
        run_daemon()
        log_to_file("daemon stopped unexpectedly")
    except Exception as err:
        log_to_file(err)
    # pause before restarting, whether run_daemon() returned or raised
    time.sleep(1)

Related

How to hard suspend or pause a python script after it runs so it doesn’t force close upon completion?

Hi, I'm working on a Python script built around a loop. The loop is currently failing (I have a rough idea why), but that isn't the problem. I've put os.system('pause') and input("prompt:") at the end of the code so that I can read the error messages before the script terminates, yet the window still closes immediately. I need a way to HARD pause or freeze the script before the window closes. Any help or further insight is appreciated.
P.S. Let me know if you need more information to better describe this problem.

I assume you are just double-clicking the icon in Windows Explorer. The disadvantage, as you are seeing, is that the terminal window closes as soon as the process finishes, so you can't tell what went wrong if it terminated due to an error.
A better method is to use the command prompt. If you are not familiar with it, there are many tutorials online.
This helps because, once you have navigated to the script's directory, you can run python your_script.py (assuming python is on your PATH environment variable) and the script runs within the same window.
Then, even if it fails, you can read the error messages, because you are simply returned to the command line.
An alternative hacky method would be to create a script called something like run_pythons.py which will use the subprocess module to call your actual script in the same window, and then (no matter how it terminates), wait for your input before terminating itself so that you can read the error messages.
So something like:
import subprocess
subprocess.call(('python', input('enter script name: ')))
input('press ENTER to kill me')

I needed something like this at one point. I had a wrapper that loaded a bunch of modules and data and then waited for a prompt to run something. If I had a stupid mistake in a module, it would quit, and that time that it spent loading all that data into memory would be wasted, which was >1min. For me, I wanted a way to keep that data in memory even if I had an error in a module so that I could edit the module and rerun the script.
To do this:
import traceback
while True:
    update = raw_input("Paused. Enter = start, 'your input' = update params, C-C = exit")
    if update:
        update = update.split()
        # irrelevant stuff used to parse my update
    # custom thing to reload all my modules
    fullReload()
    try:
        # my main script that needed all those modules and data loaded
        model_starter.main(stuff, stuff2)
    except Exception as e:
        print(e)
        traceback.print_exc()
        continue
    except KeyboardInterrupt:
        print("I think you hit C-C. Do it again to exit.")
        continue
    except:
        print("OSERROR? sys.exit()? who knows. C-C to exit.")
        continue
This kept all the data loaded that I grabbed from before my while loop started, and prevented exiting on errors. It also meant that I could still ctrl+c to quit, I just had to do it from this wrapper instead of once it got to the main script.
Is this somewhat what you're looking for?
The answer is basically, you have to catch all your exceptions and have a method to restart your loop once you figured out and fixed the issue.
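For the original double-click case, the same idea can be reduced to a small wrapper. This is a sketch for Python 3, and the name run_and_pause and its pause parameter are illustrative. An input() call at the end of a script is never reached if an exception is raised earlier, so the pause has to sit in a finally block.

```python
import traceback


def run_and_pause(main, pause=input):
    """Run main(); always wait for the user before returning, even on error."""
    try:
        main()
    except BaseException:
        # BaseException also covers SystemExit and KeyboardInterrupt,
        # which would otherwise close the window without a pause.
        traceback.print_exc()
    finally:
        pause("Press ENTER to close this window...")
```

Calling run_and_pause(main) as the last line of the script keeps the window open after a failure. It cannot help with a syntax error in the same file, because then the script never starts running; running from the command prompt remains the more reliable option.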

how to halt python program after pdb.set_trace()

When debugging scripts in Python (2.7, running on Linux) I occasionally inject pdb.set_trace() (note that I'm actually using ipdb), e.g.:
import ipdb as pdb
try:
    do_something()
    # I'd like to look at some local variables before running do_something_dangerous()
    pdb.set_trace()
except:
    pass
do_something_dangerous()
I typically run my script from the shell, e.g.
python my_script.py
Sometimes during my debugging session I realize that I don't want to run do_something_dangerous(). What's the easiest way to halt program execution so that do_something_dangerous() is not run and I can quit back to the shell?
As I understand it pressing ctrl-d (or issuing the debugger's quit command) will simply exit ipdb and the program will continue running (in my example above). Pressing ctrl-c seems to raise a KeyboardInterrupt but I've never understood the context in which it was raised.
I'm hoping for something like ctrl-q to simply take down the entire process, but I haven't been able to find anything.
I understand that my example is highly contrived, but my question is about how to abort execution from pdb when the code being debugged is set up to catch exceptions. It's not about how to restructure the above code so it works!

I found that ctrl-z to suspend the python/ipdb process, followed by 'kill %1' to terminate the process works well and is reasonably quick for me to type (with a bash alias k='kill %1'). I'm not sure if there's anything cleaner/simpler though.

From the module docs:
q(uit)
Quit from the debugger. The program being executed is aborted.
Specifically, this will cause the next debugger function that gets called to raise a BdbQuit exception.
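That detail explains the behaviour in the question: BdbQuit is an ordinary exception, so the bare except around pdb.set_trace() swallows it and execution continues. A small sketch for Python 3, with illustrative function names, raising bdb.BdbQuit by hand to stand in for the debugger's quit command:

```python
import bdb


def quit_is_swallowed():
    """Return True if a bare except hides the debugger's quit request."""
    try:
        raise bdb.BdbQuit  # stands in for what pdb raises after "q"
    except:  # noqa: E722 (bare except, as in the question)
        return True


def quit_propagates():
    """Catch only the expected error, so BdbQuit escapes to the caller."""
    try:
        raise bdb.BdbQuit
    except ValueError:
        return False
```

Because BdbQuit subclasses Exception, an except Exception clause hides it as well. When the surrounding code cannot be changed, typing import os; os._exit(1) at the pdb prompt should end the process immediately, at the cost of skipping any cleanup handlers.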

Why is my threading/multiprocessing python script not exiting properly?

I have a server script that I need to be able to shut down cleanly. While testing the usual try..except statements I realized that Ctrl-C didn't work the usual way. Normally I'd wrap long-running tasks like this
try:
    ...
except KeyboardInterrupt:
    pass  # close the script cleanly here
so the task could be shut down cleanly on Ctrl-C. I have never run into any problems with this before, but when I hit Ctrl-C while this particular script is running, the script just exits without catching the Ctrl-C.
The initial version was implemented using Process from multiprocessing. I rewrote the script using Thread from threading, but same issue there. I have used threading many times before, but I am new to the multiprocessing library. Either way, I have never experienced this Ctrl-C behavior before.
Normally I have always implemented sentinels etc to close down Queues and Thread instances in an orderly fashion, but this script just exits without any response.
Last, I tried overriding signal.SIGINT as well like this
import signal
def handler(signum, frame):
    print('Ctrl+C')
signal.signal(signal.SIGINT, handler)
...
Here Ctrl+C was actually caught, but the handler doesn't execute, it never prints anything.
Besides the threading / multiprocessing aspect, parts of the script contains C++ SWIG objects. I don't know if that has anything to do with it. I am running Python 2.7.2 on OS X Lion.
So, a few questions:
What's going on here?
How can I debug this?
What do I need to learn in order to understand the root cause?
PLEASE NOTE: The internals of the script are proprietary, so I can't give code examples. I am, however, very willing to receive pointers so I can debug this myself. I am experienced enough to figure it out if someone points me in the right direction.
EDIT: I started commenting out imports etc. to see what caused the weird behavior, and I narrowed it down to an import of a C++ SWIG library. Any ideas why importing a C++ SWIG library 'steals' Ctrl-C? I am not the author of the guilty library and my SWIG experience is limited, so I don't really know where to start...
EDIT 2: I just tried the same script on a Windows machine, and in Windows 7 the Ctrl-C is caught as expected. I'm not really going to bother with the OS X part, since the script will be run in a Windows environment anyway.

This might have to do with the way Python manages threads, signals and C calls.
In short: Ctrl-C cannot interrupt a C call. The C-level signal handler only records that the signal arrived; the Python-level handler runs later, between bytecode instructions, and only in the main thread (which is often blocked, waiting for other threads).
In fact, long operations can block everything.
Consider this:
>>> nums = xrange(100000000)
>>> -1 in nums
False (after ~ 6.6 seconds)
>>>
Now try hitting Ctrl-C while it runs (uninterruptible!):
>>> nums = xrange(100000000)
>>> -1 in nums
^C^C^C (nothing happens, long pause)
...
KeyboardInterrupt
>>>
The reason Ctrl-C doesn't work with threaded programs is that the main thread is often blocked on an uninterruptible thread join or lock (e.g. any 'wait', 'join', or just a plain empty 'main' thread, which in the background causes Python to 'join' on any spawned threads).
Try inserting a simple
while True:
    time.sleep(1)
in your main thread.
If you have a long running C function, do signal handling in C-level (May the Force be with you!).
This is largely based on David Beazley's video on the subject.
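A sketch of that sleep-loop pattern with an orderly shutdown, written for Python 3 (the answer itself concerns Python 2.7). The worker function, the ticks list and the max_seconds limit are illustrative; the limit exists only so the demo terminates on its own.

```python
import threading
import time


def worker(stop_event, ticks):
    """Hypothetical background task: record a tick until told to stop."""
    while not stop_event.is_set():
        ticks.append(time.monotonic())
        stop_event.wait(0.01)


def run(max_seconds=0.05):
    """Run the worker, keeping the main thread responsive to Ctrl-C."""
    stop_event = threading.Event()
    ticks = []
    thread = threading.Thread(target=worker, args=(stop_event, ticks))
    thread.start()
    deadline = time.monotonic() + max_seconds  # only so the demo terminates
    try:
        # Sleep in short slices rather than blocking in thread.join(), so the
        # main thread keeps returning to the interpreter loop, where a pending
        # KeyboardInterrupt can be raised.
        while thread.is_alive() and time.monotonic() < deadline:
            time.sleep(0.01)
    except KeyboardInterrupt:
        print("Ctrl-C received, shutting down")
    finally:
        stop_event.set()  # ask the worker to finish
        thread.join()     # returns promptly once the worker sees the event
    return len(ticks)
```

The Event plays the role of the sentinel mentioned in the question. None of this helps while the main thread is inside a long-running C or C++ call, which is the case the answer leaves to C-level signal handling.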

It exits because something else is likely catching the KeyboardInterrupt and then raising some other exception, or simply returning None. You should still get a traceback to help debug. You need to capture the stderr output or run your script with the -i command-line option so you can see the traceback. Also, add another except block to catch all other exceptions.
If you suspect the C++ function call is catching the CTRL+C, try capturing its output. If the C function is not returning anything, there isn't much you can do except ask the author to add some exception handling, return codes, etc.
def call_proprietary_code():
    try:
        # Doing something proprietary ...
        # catch the function call output
        result = yourCFuncCall()
        # raise an exception if it's not what you expected
        if result is None:
            raise ValueError('Unexpected Result')
    except KeyboardInterrupt:
        print('Must be a CTRL+C')
        return
    except:
        print('Unhandled Exception')
        raise

how about atexit?
http://docs.python.org/library/atexit.html#module-atexit

Handle assertion dialog box with python subprocess

I am using python to create a sub process to check and see that no assertions occur.
I want to catch the error output along with the return code. That works fine, but the problem I run into is that when it runs into the assertion it gives me a dialog box that just hangs there. I have to then click the assertion box before I retrieve any information. Is there a way to make it not pop up and continue with the program or to send a message to close the window?
This is a problem since this is an automation service.
import subprocess
pipe = subprocess.Popen('test2.exe', shell=True, stderr=subprocess.PIPE)
for line in pipe.stderr:
    print(line)
The executable is compiled from c++ code and has an assertion that will fail for testing purposes.

There's not really an easy solution in general, since a program could in theory create any number of windows waiting for user input. If you have the source code for the inferior process, the easiest thing to do would be to modify it to call _set_abort_behavior(0, _CALL_REPORTFAULT) to disable the message box.
If you don't have the source code, it's going to be much, much tougher. You could probably write a big hack that attaches a debugger to the inferior process and sets a breakpoint on the call to abort(). If that breakpoint gets hit, kill the process and return an appropriate error status. But that's an extremely non-trivial kludge.
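Short of that, an automation script can at least bound how long it waits. The sketch below is a fallback, not something the answer proposes: it does not dismiss the dialog, it only stops waiting and kills the child after a timeout. It assumes Python 3, and the function name run_with_timeout is illustrative.

```python
import subprocess


def run_with_timeout(cmd, timeout):
    """Run cmd and return (returncode, stderr_bytes).

    returncode is None if the child had to be killed after `timeout` seconds.
    """
    try:
        proc = subprocess.run(cmd, stderr=subprocess.PIPE, timeout=timeout)
    except subprocess.TimeoutExpired as exc:
        # subprocess.run() has already killed the child at this point.
        return None, exc.stderr or b""
    return proc.returncode, proc.stderr
```

A call might look like run_with_timeout(['test2.exe'], 60). Passing the command as a list rather than using shell=True matters here, because with a shell in between the timeout would kill the shell rather than the executable showing the dialog.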

As I mentioned in the comments, pop-ups for assertions aren't a normal thing in Python. If you can modify the code running in the subprocess, you might be able to catch the assertion and handle it yourself, rather than letting the environment handle it with a pop-up.
import sys
try:
    pass  # code that raises the assertion goes here
except AssertionError as e:
    sys.stderr.write("Assertion failed: " + str(e))
    sys.exit(1)
If it's looking for assertions in particular, this should work, because it catches the assertion, prints an error message and then raises a plain SystemExit exception instead.

Killing Python webservers

I am looking for a simple Python webserver that is easy to kill from within code. Right now, I'm playing with Bottle, but I can't find any way at all to kill it in code. If you know how to kill Bottle (in code, no Ctrl+C) that would be super, but I'll take anything that's Python, simple, and killable.

We use this.
import os
os._exit(3)
To crash in a 'controlled' way.
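os._exit() ends the process immediately without running cleanup handlers, whereas sys.exit() raises SystemExit and lets the interpreter shut down normally. A sketch of the difference for Python 3.7+, using a child process so that the caller survives; the CHILD script and the run_child helper are illustrative.

```python
import subprocess
import sys

# Child script: registers a cleanup handler, then exits one of two ways.
CHILD = """
import atexit, os, sys
atexit.register(lambda: print("cleanup ran"))
if sys.argv[1] == "hard":
    os._exit(3)   # immediate exit: atexit handlers are skipped
sys.exit(3)       # normal shutdown: atexit handlers run
"""


def run_child(mode):
    """Run the child in the given mode; return (exit code, captured stdout)."""
    proc = subprocess.run(
        [sys.executable, "-c", CHILD, mode],
        capture_output=True,
        text=True,
    )
    return proc.returncode, proc.stdout.strip()
```

Both modes report exit code 3 to the parent, but only the sys.exit() path prints the cleanup message. That is the trade-off behind calling os._exit() a 'controlled' crash: it is reliable, but open files, buffers and atexit handlers are not dealt with.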

If you want to kill a process from Python on a Unix-like platform, you can send it the same signal that Ctrl-C sends at the console, using Python's os module, e.g.
import os
import signal
# Get this process's PID
pid_of_process = os.getpid()
# Send the interrupt signal to this process
os.kill(pid_of_process, signal.SIGINT)

Raise an exception and handle it in main, or use sys.exit.

Try putting
import sys
at the top and the command
sys.exit(0)
in the code that handles the "kill request".
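If any simple, killable server will do, the standard library's http.server can be stopped from code without ending the whole process. This is a sketch for Python 3 and does not use Bottle; the Hello handler and the start_server and stop_server helpers are illustrative names.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class Hello(BaseHTTPRequestHandler):
    def do_GET(self):
        body = b"hello"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def start_server(port=0):
    """Serve on localhost in a background thread; port 0 picks a free port."""
    server = HTTPServer(("127.0.0.1", port), Hello)
    thread = threading.Thread(target=server.serve_forever)
    thread.start()
    return server, thread


def stop_server(server, thread):
    """Stop the server from code and wait for its thread to finish."""
    server.shutdown()      # makes serve_forever() return
    server.server_close()  # releases the listening socket
    thread.join()
```

shutdown() has to be called from a different thread than the one running serve_forever(), otherwise it deadlocks. A request handler that wants to stop the server should therefore start a short-lived thread to make the call.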
