I have a program that uses TensorFlow on unsupported hardware, so every time I run it I get the "Illegal instruction (Core dumped)" error.
My main goal is to capture this error; I don't want to solve it.
The error is not printed to the stderr of my program; it is printed to the stderr of bash.
Then my program exits with code 33792, which corresponds to exit status 132 (128 + 4, where 4 is SIGILL).
I cannot capture it using the method mentioned here, because I'm running my command with docker run and I can't pass it the curly brackets.
Is there any way to capture the stderr of bash without the curly brackets?
Also, how exactly is SIGILL generated? What is happening behind the scenes?
Is SIGILL triggered in the parent process (bash in my case) and passed to the child process (my program), or vice versa?
I tried adding a SIGILL handler to my program to see if I could capture it, but my program froze instead of printing the "Illegal instruction" error.
I'm using Debian 11 and my program is written in Python.
Edit:
The SIGILL kills my Python program, and my goal is to capture the SIGILL from inside my program, print an error, and kill my program afterward.
I don't want the "Illegal instruction" error to be printed to bash's stderr; I want it printed to my program's stderr or stdout.
Edit: here's the SIGILL handler I have in my code
import signal

def sigill_handler(sig, frame):
    print("Illegal Instruction. terminating.")

signal.signal(signal.SIGILL, sigill_handler)
Notice that this is the only signal I'm handling in my code.
Citing https://docs.python.org/3/library/signal.html:
Execution of Python signal handlers
A Python signal handler does not get executed inside the low-level (C) signal handler. Instead, the low-level signal handler sets a flag which tells the virtual machine to execute the corresponding Python signal handler at a later point (for example at the next bytecode instruction). This has consequences:
It makes little sense to catch synchronous errors like SIGFPE or SIGSEGV that are caused by an invalid operation in C code. Python will return from the signal handler to the C code, which is likely to raise the same signal again, causing Python to apparently hang. From Python 3.3 onwards, you can use the faulthandler module to report on synchronous errors.
A long-running calculation implemented purely in C (such as regular expression matching on a large body of text) may run uninterrupted for an arbitrary amount of time, regardless of any signals received. The Python signal handlers will be called when the calculation finishes.
If the handler raises an exception, it will be raised “out of thin air” in the main thread. See the note below for a discussion.
According to https://docs.python.org/3/library/faulthandler.html, all the faulthandler can do is to dump a stack trace, so it does not help for your requirement.
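For reference, here is a minimal sketch of what faulthandler reports when a process receives SIGILL. The child script and the use of os.kill to simulate the fault are assumptions made for illustration; a real illegal instruction in native code is raised by the CPU, not sent with os.kill. The helper name run_child_with_faulthandler is hypothetical.

```python
import signal
import subprocess
import sys

# Hypothetical child script: enable faulthandler, then simulate the fault.
CHILD_CODE = """
import faulthandler, os, signal
faulthandler.enable()                 # installs C-level handlers, SIGILL included
os.kill(os.getpid(), signal.SIGILL)   # stand-in for a real illegal instruction
"""


def run_child_with_faulthandler():
    """Run the child and return (returncode, captured stderr text)."""
    proc = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        capture_output=True,
        text=True,
    )
    return proc.returncode, proc.stderr
```

The traceback lands on the child's own stderr, but the child is still killed by the signal afterwards, so this only adds diagnostics and does not let the program continue.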
What you could do is to run your possibly failing program from your own wrapper program where you can check the wait status and decide what you display to the user if the program was killed by SIGILL.
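A minimal sketch of such a wrapper, assuming a Unix system. The function names are hypothetical. It treats both a negative return code (the subprocess convention for "killed by signal N") and a code above 128 (the shell convention, as in the 132 from the question) as a signal death; the second case is a heuristic, because a program can also exit deliberately with such a code.

```python
import os
import signal
import subprocess
import sys


def returncode_from_wait_status(status):
    """Decode a raw Unix wait status, e.g. the value os.system() returns."""
    if os.WIFSIGNALED(status):
        return -os.WTERMSIG(status)  # same convention as subprocess
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return None


def fatal_signal(returncode):
    """Return the signal that apparently ended the process, or None."""
    if returncode is None:
        return None
    if returncode < 0:
        signum = -returncode
    elif returncode > 128:
        signum = returncode - 128  # shell convention; heuristic only
    else:
        return None
    try:
        return signal.Signals(signum)
    except ValueError:
        return None


def run_wrapped(command):
    """Run command and report a SIGILL death on the wrapper's own stderr."""
    returncode = subprocess.run(command).returncode
    if fatal_signal(returncode) == signal.SIGILL:
        print("Illegal instruction. Terminating.", file=sys.stderr)
    return returncode
```

With these helpers, the wait status 33792 from the question decodes to exit status 132, which fatal_signal maps to SIGILL.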
It would be better to check if your program runs on a supported platform before using any tensorflow functions.
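One way to sketch such a check, assuming x86 Linux where /proc/cpuinfo contains a "flags" line. The required flag ("avx") is an assumption; check the requirements of the TensorFlow build you actually use. The function names are hypothetical.

```python
def cpu_flags(cpuinfo_text):
    """Return the set of CPU flags found in /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def cpu_supports(required=("avx",), path="/proc/cpuinfo"):
    """Return True/False, or None when the flags cannot be determined."""
    try:
        with open(path, encoding="utf-8") as handle:
            flags = cpu_flags(handle.read())
    except OSError:
        return None
    if not flags:
        return None  # e.g. a non-x86 machine without a "flags" line
    return set(required) <= flags
```

The program can call cpu_supports() before importing TensorFlow and print its own error message when the result is False.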
Related
This question is not about how to use sys.exit (or raising SystemExit directly), but rather about why you would want to use it.
If a program terminates successfully, I see no point in explicitly exiting at the end.
If a program terminates with an error, just raise that error. Why would you need to explicitly exit the program or why would you need an exit code?
Letting the program exit with an exception is not user-friendly. More precisely, it is perfectly fine when the user is a Python programmer, but if you provide a program to end users, they will expect clear error messages instead of a Python stack trace that they will not understand.
In addition, if you write a GUI application (with tkinter or PyQt, for example), the traceback is likely to be lost, especially on Windows. In that case, you will set up error processing that gives the user the relevant information and then terminates the application from inside the error-processing routine. sys.exit is appropriate in that use case.
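A minimal command-line sketch of that pattern. The line-counting task, the exit codes 1 and 2, and all function names are assumptions chosen for illustration.

```python
import sys


def count_lines(path):
    """Count the lines of a text file."""
    with open(path, encoding="utf-8") as handle:
        return sum(1 for _ in handle)


def main(argv):
    """Do the work and return an exit status instead of leaking a traceback."""
    if len(argv) != 1:
        print("usage: count_lines FILE", file=sys.stderr)
        return 2
    try:
        print(count_lines(argv[0]))
    except OSError as exc:
        # Friendly message for end users; the exit status tells scripts it failed.
        print(f"error: cannot read {argv[0]}: {exc.strerror}", file=sys.stderr)
        return 1
    return 0


def run(argv):
    """Entry point: turn main()'s result into the process exit status."""
    sys.exit(main(argv))
```

A script would call run(sys.argv[1:]) at the bottom. The calling shell or a parent process can then tell success from failure by the exit status alone.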
I am running a remote Python script on AWS (EC2 ubuntu) in background. The script performs some file manipulations, launches a long running simulation (subprocess run with os.system(...)) and writes some log files. I would like to manage the status of the running script and hopefully exit gracefully from various conditions. Specifically:
The sub-process is interrupted by the user with signal 15.
The simulation (sub-process) fails (signal 8 - Floating point exception)
The VM is rebooted.
The VM is terminated. I am using Elastic File System, so even if the instance is destroyed, the files are not.
I know how to handle basic exceptions, but I am a bit lost when I need to catch exceptions from subprocesses. Can you recommend a solid approach?
EDIT: Please notice the bold part.
For the scenarios you list, try signal handling. Case 1 (signal 15) and case 3 (the VM is rebooted) are similar: signal 15 (SIGTERM) is generally sent as part of the shutdown sequence, or by a user with the proper privileges, so one handler serves both.
Case 2 is signal 8 (SIGFPE).
import signal

def signalHandler(sigNum, frameObject):
    if sigNum == signal.SIGTERM:
        # Code for handling signal 15 goes here
        pass
    elif sigNum == signal.SIGFPE:
        # Code for handling signal 8 goes here
        pass

signal.signal(signal.SIGTERM, signalHandler)  # signal 15
signal.signal(signal.SIGFPE, signalHandler)  # signal 8
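The handlers above only see signals delivered to the Python script itself. A signal that kills the simulation has to be read from the child's return code. The sketch below is one possible arrangement for Unix, and it uses subprocess.Popen instead of the os.system call from the question (a substitution, not the asker's code). The function name is hypothetical.

```python
import signal
import subprocess
import sys


def run_simulation(command):
    """Run command, forward SIGTERM to it, and describe how it ended."""
    child = subprocess.Popen(command)

    def forward_sigterm(signum, frame):
        # Pass the termination request on so the child can stop too.
        if child.poll() is None:
            child.terminate()

    previous = signal.signal(signal.SIGTERM, forward_sigterm)
    try:
        returncode = child.wait()
    finally:
        # Restore whatever handler was installed before.
        if previous is None:
            previous = signal.SIG_DFL
        signal.signal(signal.SIGTERM, previous)

    if returncode < 0:
        # subprocess reports "killed by signal N" as -N on POSIX.
        return "killed by " + signal.Signals(-returncode).name
    return "exited with code " + str(returncode)
```

A simulation that dies with a floating point exception then shows up as "killed by SIGFPE", and the script can write that to its log before cleaning up.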
I might be misunderstanding you, but you can put all exception-raising code in try-except blocks. You seem pretty knowledgeable, but I'll give an example anyway.
try:
    ...  # some potentially error-causing code
except ErrorType:  # you need to know which type of exception will be raised
    ...  # code for what to do if the error occurs
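Applied to a subprocess, the same idea could look like the sketch below. It assumes the child is started with subprocess.run and check=True, which raises CalledProcessError on a non-zero exit; the function name and messages are hypothetical.

```python
import subprocess
import sys


def run_checked(command):
    """Run command and translate failures into a short status message."""
    try:
        subprocess.run(command, check=True)
    except FileNotFoundError:
        return "command not found"
    except subprocess.CalledProcessError as exc:
        # On POSIX a negative returncode means the child was killed by a signal.
        return "failed with return code " + str(exc.returncode)
    return "ok"
```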
I have a server script that I need to be able to shut down cleanly. While testing the usual try..except statements I realized that Ctrl-C didn't work the usual way. Normally I'd wrap long-running tasks like this
try:
    ...
except KeyboardInterrupt:
    pass  # close the script cleanly here
so the task could be shut down cleanly on Ctrl-C. I have never run into any problems with this before, but when I hit Ctrl-C while this particular script is running, the script just exits without catching the Ctrl-C.
The initial version was implemented using Process from multiprocessing. I rewrote the script using Thread from threading, but same issue there. I have used threading many times before, but I am new to the multiprocessing library. Either way, I have never experienced this Ctrl-C behavior before.
Normally I have always implemented sentinels etc to close down Queues and Thread instances in an orderly fashion, but this script just exits without any response.
Last, I tried overriding signal.SIGINT as well, like this
def handler(signum, frame):
    print('Ctrl+C')

signal.signal(signal.SIGINT, handler)
...
Here Ctrl+C was actually caught, but the handler doesn't execute; it never prints anything.
Besides the threading / multiprocessing aspect, parts of the script contains C++ SWIG objects. I don't know if that has anything to do with it. I am running Python 2.7.2 on OS X Lion.
So, a few questions:
What's going on here?
How can I debug this?
What do I need to learn in order to understand the root cause?
PLEASE NOTE: The internals of the script are proprietary, so I can't give code examples. I am, however, very willing to receive pointers so I can debug this myself. I am experienced enough to figure it out if someone points me in the right direction.
EDIT: I started commenting out imports etc to see what caused the weird behavior, and I narrowed it down to an import of a C++ SWIG library. Any ideas why importing a C++ SWIG library 'steals' Ctrl-C? I am not the author of the guilty library however and my SWIG experience is limited so don't really know where to start...
EDIT 2: I just tried the same script on a Windows machine, and in Windows 7 the Ctrl-C is caught as expected. I'm not really going to bother with the OS X part; the script will be run in a Windows environment anyway.
This might have to do with the way Python manages threads, signals and C calls.
In short - Ctrl-C cannot interrupt C calls, since the implementation requires that a python thread will handle the signal, and not just any thread, but the main thread (often blocked, waiting for other threads).
In fact, long operations can block everything.
Consider this:
>>> nums = xrange(100000000)
>>> -1 in nums
False (after ~ 6.6 seconds)
>>>
Now try hitting Ctrl-C while it runs (uninterruptible!)
>>> nums = xrange(100000000)
>>> -1 in nums
^C^C^C (nothing happens, long pause)
...
KeyboardInterrupt
>>>
The reason Ctrl-C doesn't work with threaded programs is that the main thread is often blocked on an uninterruptible thread-join or lock (e.g., any 'wait', 'join' or just a plain empty 'main' thread, which in the background causes Python to 'join' on any spawned threads).
Try to insert a simple
while True:
    time.sleep(1)
in your main thread.
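A fuller sketch of that sleep loop, written for Python 3 (the thread itself uses Python 2.7). The worker, the stop event, and the max_seconds parameter are assumptions added for illustration. Note that since Python 3.2 lock acquisition can be interrupted by signals on POSIX, so the original problem mostly affects older versions.

```python
import threading
import time


def worker(stop_event, ticks):
    """Stand-in for real background work; stops when stop_event is set."""
    while not stop_event.is_set():
        ticks.append(time.monotonic())
        stop_event.wait(0.05)


def run_with_worker(max_seconds=None):
    """Keep the main thread in short sleeps so Ctrl-C is noticed promptly."""
    stop_event = threading.Event()
    ticks = []
    thread = threading.Thread(target=worker, args=(stop_event, ticks))
    thread.start()
    deadline = None if max_seconds is None else time.monotonic() + max_seconds
    try:
        while thread.is_alive():
            if deadline is not None and time.monotonic() >= deadline:
                break
            time.sleep(0.1)  # returns to the interpreter regularly
    except KeyboardInterrupt:
        print("Ctrl-C received, shutting down")
    finally:
        stop_event.set()  # ask the worker to finish
        thread.join()
    return len(ticks)
```

The worker is stopped through the event in the finally block, so the shutdown path is the same whether the loop ends normally or through Ctrl-C.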
If you have a long-running C function, do the signal handling at the C level (May the Force be with you!).
This is largely based on David Beazley's video on the subject.
It exits because something else is likely catching the KeyboardInterrupt and then raising some other exception, or simply returning None. You should still get a traceback to help debug. Capture the stderr output, or run your script with the -i command-line option, so you can see the traceback. Also, add another except block to catch all other exceptions.
If you suspect the C++ function call of catching the CTRL+C, try capturing its output. If the C function is not returning anything, there isn't much you can do except ask the author to add some exception handling, return codes, etc.
def call_proprietary_code():
    try:
        # Doing something proprietary ...
        # Capture the function call's output
        result = yourCFuncCall()
        # Raise an exception if it's not what you expected
        if result is None:
            raise ValueError('Unexpected Result')
    except KeyboardInterrupt:
        print('Must be a CTRL+C')
        return
    except:
        print('Unhandled Exception')
        raise
How about atexit?
http://docs.python.org/library/atexit.html#module-atexit
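A minimal sketch of atexit usage; the cleanup function and the list it writes to are hypothetical. According to the atexit documentation, registered functions are not called when the program is killed by a signal that Python does not handle, when a fatal internal error occurs, or when os._exit() is called, so this covers normal exits only.

```python
import atexit

shutdown_log = []


def close_resources(log):
    """Hypothetical cleanup step: record that shutdown work was done."""
    log.append("resources closed")


# Runs close_resources(shutdown_log) at normal interpreter exit.
atexit.register(close_resources, shutdown_log)
```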
What is the workflow for processing a signal in Python? When I set a signal handler and the signal occurs, how does Python invoke my function? Does the OS invoke it as it would in a C program?
If I am in a C extension of Python, is it interrupted immediately?
Now it's clear to me how a Python process handles a signal. When you set a handler with the signal module, the module registers a C function, signal_handler (see $src/Modules/signalmodule.c), as the low-level handler. When the signal arrives, signal_handler marks your handler as tripped (Handlers[sig_num].tripped = 1;) and then calls Py_AddPendingCall to notify the Python interpreter. The interpreter later invokes Py_MakePendingCalls, which calls PyErr_CheckSignals, which calls your function from the main loop (see $src/Python/ceval.c).
If you set a Python code signal handler using the signal module the interpreter will only run it when it re-enters the byte-code interpreter. The handler is not run right away. It is placed in a queue when the signal occurs. If the code path is currently in C code, built-in or extension module, the handler is deferred until the C code returns control to the Python byte code interpreter. This can be a long time, and you can't really predict how long.
Most notably, if you are using interactive mode with readline enabled, your signal handler won't run until you give it some input to interpret. This is because the input code is in the readline library (C code) and doesn't return to the interpreter until it has a complete line.
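A small sketch that shows a Python-level handler being run by the interpreter rather than by the OS directly. It assumes a Unix system (SIGUSR1) and must be called from the main thread; the function name is hypothetical.

```python
import os
import signal
import time


def deliver_to_self(signum=signal.SIGUSR1, timeout=1.0):
    """Send signum to this process and return what the Python handler saw."""
    seen = []

    def handler(received_signum, frame):
        # Called by the interpreter in the main thread, between bytecodes.
        seen.append(received_signum)

    previous = signal.signal(signum, handler)
    try:
        os.kill(os.getpid(), signum)
        # Give the interpreter a chance to notice the pending signal.
        deadline = time.monotonic() + timeout
        while not seen and time.monotonic() < deadline:
            time.sleep(0.01)
    finally:
        signal.signal(signum, signal.SIG_DFL if previous is None else previous)
    return seen
```

If the main thread were stuck inside a long C call instead of this polling loop, the handler would not run until that call returned.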
Take a look at the signal module. As I understand it, if a signal is sent to a Python script and a handler is registered for it, the handler processes the signal first, and it can handle or ignore certain signals. For example, instead of dying immediately on SIGTERM, you can perform some shutdown cleanup work before exiting. Note that SIGKILL and SIGSTOP cannot be caught or ignored.
Why doesn't Ctrl+C work to break a Python program that uses PyQt? I want to debug it and get a stack trace and for some reason, this is harder to do than with C++!
CTRL+C causes a signal to be sent to the process. Python catches the signal, and sets a global variable, something like CTRL_C_PRESSED = True. Then, whenever the Python interpreter gets to execute a new opcode, it sees the variable set and raises a KeyboardInterrupt.
This means that CTRL+C works only if the Python interpreter is spinning. If the interpreter is executing an extension module written in C that executes a long-running operation, CTRL+C won't interrupt it, unless it explicitly "cooperates" with Python. Eg: time.sleep() is theoretically a blocking operation, but the implementation of that function "cooperates" with the Python interpreter to make CTRL+C work.
This is all by design: CTRL+C is meant to do a "clean abort"; this is why it gets turned into an exception by Python (so that the cleanups are executed during stack unwind), and its support by extension modules is sort of "opt-in". If you want to totally abort the process, without giving it a chance to clean up, you can use CTRL+\.
When Python calls QApplication::exec() (the C++ function), Qt doesn't know how to "cooperate" with Python for CTRL+C, and this is why it does not work. I don't think there's a good way to "make it work"; you may want to see if you can handle it through a global event filter.
— Giovanni Bajo
Adding this to the main program solved the problem.
import signal
signal.signal(signal.SIGINT, signal.SIG_DFL)
I'm not sure what this has to do with the explanation.
I agree with Neil G, and would add this:
If you do not call QApplication.exec_() to start the event loop, and instead execute your program in an interactive Python shell (using python -i), then PyQt will automatically process events whenever the interactive prompt is waiting, and Ctrl-C should again behave as expected. This is because the Qt event loop shares time with the Python interpreter rather than running exclusively, giving the interpreter a chance to catch those interrupts.