mpiexec + python + ^C: __del__ method not executed (and no traceback) - python

I have the following test_mpi.py python script:
from mpi4py import MPI
import time

class Foo:
    def __init__(self):
        print('Creation object.')
    def __del__(self):
        print('Object destruction.')

foo = Foo()
time.sleep(10)
If I execute it without recourse to mpiexec, using a simple python test_mpi.py, pressing CTRL+C after 5s, I get the following output:
ngreiner@Nathans-MacBook-Pro:~/Documents/scratch$ python test_mpi.py
Creation object.
^CTraceback (most recent call last):
File "test_mpi.py", line 26, in <module>
time.sleep(10)
KeyboardInterrupt
Object destruction.
ngreiner@Nathans-MacBook-Pro:~/Documents/scratch$
If I embed it within an mpiexec execution, using mpiexec -np 1 python test_mpi.py, again pressing CTRL+C after 5s, I now get:
ngreiner@Nathans-MacBook-Pro:~/Documents/scratch$ mpiexec -np 1 python test_mpi.py
Creation object.
^Cngreiner@Nathans-MacBook-Pro:~/Documents/scratch$
The traceback from python and the execution of the __del__ method have disappeared.
The main problem for me is the non-execution of the __del__ method, which is supposed to make some clean-up in my actual application.
Any idea how I could have the __del__ method executed when the Python execution is launched from mpiexec?
Thank you very much in advance for the help,
(My system configuration: macOS High sierra 10.13.6, Python 3.7.4, open-mpi 4.0.1, mpi4py 3.0.2.)

After a bit of searching, I found a way to restore the printing of the traceback and the execution of the __del__ method when hitting ^C during an mpiexec execution.
During a normal Python execution (not launched by mpiexec, launched directly from the terminal), hitting ^C sends a SIGINT signal to python, which translates it into a KeyboardInterrupt exception (https://docs.python.org/3.7/library/signal.html).
But when hitting ^C during an mpiexec execution, it is the mpiexec process which receives the SIGINT signal, and instead of propagating it to its child processes (for instance python), it sends them a SIGTERM signal (https://www.open-mpi.org/doc/current/man1/mpirun.1.php).
Python thus does not react the same way to SIGINT and SIGTERM signals.
The workaround I found is to use the signal module, and to use a specific handler for the SIGTERM signal, which simply raises a KeyboardInterrupt. This can be achieved by the following lines:
import signal

def sigterm_handler():
    raise KeyboardInterrupt

signal.signal(signal.SIGTERM, sigterm_handler)
These lines can be included at the top of the executed Python script, or, to retain this behaviour every time Python is used with mpiexec and the mpi4py package, at the top of the __init__.py file of the mpi4py package.
This strategy may have side-effects (which I am unaware of) and should be used at your own risk.

Per the documentation, it is not guaranteed that __del__ will be called, so you were lucky that it was called in your non-MPI run.
For a simple case, you can use try/finally to be sure that the finally section is executed.
Or, more generically, use a context manager; see the sketch below.
Here is the quote from the documentation that is important here:
It is not guaranteed that __del__() methods are called for objects that still exist when the interpreter exits.
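For illustration, a minimal sketch of both approaches, reusing the Foo class from the question (the close() method is a stand-in for whatever clean-up the real application needs):
import time

class Foo:
    def __init__(self):
        print('Creation object.')
    def close(self):
        print('Clean-up.')
    # Context-manager protocol: __exit__ runs even when an exception
    # such as KeyboardInterrupt propagates out of the with block.
    def __enter__(self):
        return self
    def __exit__(self, exc_type, exc_value, tb):
        self.close()

# try/finally variant: the finally clause runs on KeyboardInterrupt.
foo = Foo()
try:
    time.sleep(10)
finally:
    foo.close()

# Context-manager variant: equivalent, but the clean-up travels with Foo.
with Foo() as foo:
    time.sleep(10)
Note that under mpiexec this still relies on the SIGTERM handler from the answer above: finally clauses and __exit__ methods only run when the process dies by exception, not when it is killed outright.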

The answer by ngreiner helped me, but at least with Python 2.7 and all Python 3 versions, the handler function needs two arguments. This modified code snippet with dummy arguments worked for me:
import signal

def sigterm_handler(signum, frame):
    raise KeyboardInterrupt

signal.signal(signal.SIGTERM, sigterm_handler)

Related

How do I debug functions registered by `atexit` using `pdb` in Python?

Main question
My understanding is that atexit runs registered functions in reverse order of registration when the interpreter terminates, for instance when it is quit or killed by a signal that Python handles. So if I have a file called register-arguments.py as follows:
def first_registered():
    print('first registered')

def second_registered():
    print('second registered')

import atexit
atexit.register(first_registered)
atexit.register(second_registered)
Running python register-arguments.py will trigger the following steps:
The interpreter is started.
The functions are registered, which generates no output in the terminal.
The interpreter is terminated, and the registered functions are called in reverse order.
The output is as follows:
second registered
first registered
Which makes sense. However, if I try to debug one of these functions with Python's native debugger pdb, here's what I get:
$ python -m pdb .\register-arguments.py
> ...\atexit-notes\register-arguments.py(1)<module>()
-> def first_registered():
(Pdb) b 2
Breakpoint 1 at c:\users\amine.aboufirass\desktop\temp\atexit-notes\register-arguments.py:2
(Pdb) c
As you can see, the program starts and stops at line 1 of the source file. I then intentionally place a breakpoint at line 2 and use the c command to continue execution.
I expected the debugger to stop once the registered functions are called (when the interpreter exits). However, that doesn't happen, and I get the following message:
The program finished and will be restarted.
How do I debug functions registered by atexit using pdb in Python?
On pdb.set_trace
It seems that explicitly adding a trace inside the source code does work:
import pdb

def first_registered(arg):
    pdb.set_trace()
    print(f'first registered, arg={arg}')

def second_registered():
    print('second registered')

import atexit
atexit.register(first_registered, arg="test")
atexit.register(second_registered)
Running python register-arguments.py will drop you into the debugger at the line where the trace is explicitly added, and the value of the variable arg at that breakpoint is indeed 'test'.
I have a strong preference for not editing the source code, so this unfortunately won't work for me. I'm curious to see whether there's a way to do it without editing the source code.

Python script can't be terminated through Ctrl+C or Ctrl+Break

I have this simple python script called myMain.py to execute another python program automatically with incremental number, and I'm running it on CentOS 7:
#!/usr/bin/python
import os
import sys
import time

def main():
    step_indicator = ""
    arrow = ">"
    step = 2
    try:
        for i in range(0, 360, step):
            step_percentage = float(i)/360.0 * 100
            if i % 10 == 0:
                step_indicator += "="
            os.system("python myParsePDB.py -i BP1.pdb -c 1 -s %s" % step)
            print("step_percentage%s%s%.2f" % (step_indicator, arrow, step_percentage) + "%")
    except KeyboardInterrupt:
        print("Stop me!")
        sys.exit(0)

if __name__ == "__main__":
    main()
For now I only know that this script is single-threaded, but I can't terminate it with a Ctrl+C keyboard interruption.
I have read some related questions, such as Cannot kill Python script with Ctrl-C and Stopping python using ctrl+c, and I realized that Ctrl+Z does not kill the process, it only pauses it and keeps it in the background. Ctrl+Break doesn't work for my case either; I think it only terminates my main thread but keeps the child process alive.
I also noticed that calling os.system() spawns a child process from the currently executing process. At the same time, I also have os file I/O functions, and os.system("rm -rf legacy/*") is invoked in myParsePDB.py, which means the myParsePDB.py child process spawns child processes as well. So if I want to catch Ctrl+C in myMain.py, should I daemonize only myMain.py, or each process as it spawns?
This is a general problem that can arise when dealing with signal handling, and Python's signal module is no exception: it is a wrapper around operating-system signals. Signal processing in Python therefore depends on the operating system, the hardware, and many other conditions. However, the way to deal with these problems is similar.
The following paragraphs, quoted from this tutorial, are relevant here: signal – Receive notification of asynchronous system events
Signals are an operating system feature that provide a means of
notifying your program of an event, and having it handled
asynchronously. They can be generated by the system itself, or sent
from one process to another. Since signals interrupt the regular flow
of your program, it is possible that some operations (especially I/O)
may produce error if a signal is received in the middle.
Signals are identified by integers and are defined in the operating
system C headers. Python exposes the signals appropriate for the
platform as symbols in the signal module. For the examples below, I
will use SIGINT and SIGUSR1. Both are typically defined for all Unix
and Unix-like systems.
In my code:
os.system("python myParsePDB.py -i BP1.pdb -c 1 -s %s" % step) inside the for loop executes for a while and spends part of that time on file I/O. If the keyboard interrupt arrives too quickly and is not caught asynchronously after the files are written, the signal may be blocked by the operating system, so execution stays inside the try clause's for loop. (Errors detected during execution are called exceptions and are not unconditionally fatal: Python Errors and Exceptions.)
Therefore, the simplest way to make them asynchronous is to wait:
try:
    for i in range(0, 360, step):
        os.system("python myParsePDB.py -i BP1.pdb -c 1 -s %s" % step)
        time.sleep(0.2)
except KeyboardInterrupt:
    print("Stop me!")
    sys.exit(0)
It might hurt performance, but it guarantees that the signal can be caught after each os.system() call finishes. You might also want to use other sync/async functions if better performance is required; one possibility is sketched below.
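As an aside, one such alternative (my own hedged sketch, not tested against the asker's setup) is the subprocess module: the C system() call that os.system wraps ignores SIGINT in the parent while the child runs, whereas subprocess leaves the parent's SIGINT handling alone, so Ctrl+C raises KeyboardInterrupt promptly without the extra sleep:
import subprocess
import sys

step = 2
try:
    for i in range(0, 360, step):
        # subprocess.call waits for the child like os.system does, but the
        # parent keeps its default SIGINT disposition, so Ctrl+C still
        # raises KeyboardInterrupt here.
        subprocess.call(["python", "myParsePDB.py",
                         "-i", "BP1.pdb", "-c", "1", "-s", str(step)])
except KeyboardInterrupt:
    print("Stop me!")
    sys.exit(0)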
For more unix signal reference, please also look at: Linux Signal Manpage

Singleton send signal to actual running process

I've developed a program in Python and PyGTK, and today I added a singleton feature that prevents it from running if it is already running. But now I want to go further: if it is running, somehow make it call self.window.present() to show it.
So I've been looking at signals, pipes, FIFOs, MQs, sockets, etc. for three hours now! I don't know if I'm just not seeing it or what, but I can't find a way to do this (even though lots of apps do it).
Now, the question is: how do I send a "signal" to a running instance of the same script (which is not sitting in an infinite loop listening for it, but doing its job) to make it call a function?
I'm trying sending signals, using:
os.kill(int(apid[0]),signal.SIGUSR1)
and receiving them with:
signal.signal(signal.SIGUSR1, self.handler)

def handler(signum, frame):
    print 'Signal handler called with signal', signum
but it kills the running process with
Traceback (most recent call last):
File "./algest_new.py", line 4080, in <module>
gtk.main()
KeyboardInterrupt
The simple answer is, you don't. When you say you have implemented a "singleton feature" I'm not sure exactly what you mean. It seems almost as though you are expecting the code in the second process to be able to see the singleton object in the first one, which clearly isn't possible. But I may have misunderstood.
The usual way to do this is to create a file with a unique name at a known location, typically containing the process id of the running process. If you start your program and it sees the file already present it knows to explain to the user that there's a copy already running. You could also send a signal to that process (under Unix, anyway) to tell it to bring its window to the foreground.
Oh, and don't forget that your program should delete the PID file when it terminates :-)
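For concreteness, a minimal sketch of the PID-file pattern described above (the file path and the choice of SIGUSR1 are illustrative, not from the original answer):
import os
import signal
import sys

PIDFILE = '/tmp/myapp.pid'  # illustrative location

def running_instance_pid():
    # Return the PID recorded in the PID file, or None if absent or stale.
    try:
        with open(PIDFILE) as f:
            pid = int(f.read().strip())
        os.kill(pid, 0)  # probe only: raises OSError if no such process
        return pid
    except (IOError, OSError, ValueError):
        return None

pid = running_instance_pid()
if pid is not None:
    # Ask the running instance to raise its window; it must have
    # installed a SIGUSR1 handler that calls window.present().
    os.kill(pid, signal.SIGUSR1)
    sys.exit(0)

with open(PIDFILE, 'w') as f:
    f.write(str(os.getpid()))
try:
    pass  # ... run the application main loop here ...
finally:
    os.remove(PIDFILE)  # clean up the PID file on exit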
Confusingly, gtk.main will raise the KeyboardInterrupt exception if the signal handler raises any exception. With this program:
import gtk
import signal

def ohno(*args):
    raise Exception("Oh no")

signal.signal(signal.SIGUSR1, ohno)
gtk.main()
After launching, calling os.kill(pid, signal.SIGUSR1) from another process results in this exception:
File "signaltest.py", line 9, in <module>
gtk.main()
KeyboardInterrupt
This seems to be an issue with pygtk - an exception raised by a signal.signal handler in a non-gtk python app will do the expected thing and display the handler's exception (e.g. "Oh no").
So in short: if gtk.main is raising KeyboardInterrupt in response to other signals, check that your signal handlers aren't raising exceptions of their own.
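As a defensive measure (my own hedged sketch, not from the original answer), the handler can be written so that no exception can escape it:
import traceback

def present_window(signum, frame):
    # Report any failure instead of letting it propagate into gtk.main(),
    # where it would surface as a misleading KeyboardInterrupt.
    try:
        window.present()  # 'window' stands in for the app's gtk.Window
    except Exception:
        traceback.print_exc()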

Python - Function is unable to run in new thread

I'm trying to kill the notepad.exe process on Windows using this function:
import thread, wmi, os
import inspect  # needed by the else branch below

print 'CMD: Kill command called'

def kill():
    c = wmi.WMI()
    Commands = ['notepad.exe']
    if Commands[0] != 'All':
        print 'CMD: Killing: ', Commands[0]
        for process in c.Win32_Process():
            if process.Name == Commands[0]:
                process.Terminate()
    else:
        print 'CMD: trying to kill all processes'
        for process in c.Win32_Process():
            if process.executablepath != inspect.getfile(inspect.currentframe()):
                try:
                    process.Terminate()
                except:
                    print 'CMD: Unable to kill: ', process.Name

kill()                             # works
thread.start_new_thread(kill, ())  # not working
It works like a charm when I'm calling the function like this:
kill()
But when running the function in a new thread it crashes and I have no idea why.
import thread, wmi, os
import pythoncom

print 'CMD: Kill command called'

def kill():
    pythoncom.CoInitialize()
    . . .
Running Windows functions in threads can be tricky since it often involves COM objects. Calling pythoncom.CoInitialize() in the thread usually allows you to do it. Also, you may want to take a look at the threading library; it's much easier to deal with than thread.
There are a couple of problems (EDIT: the second problem has been addressed since I started my answer, by "MikeHunter", so I will skip that):
Firstly, your program ends right after starting the thread, taking the thread with it. I will assume this is not a problem long-term, because presumably this is going to be part of something bigger. To get around it for now, you can simulate something else keeping the program going by adding a time.sleep() call at the end of the script with, say, 5 seconds as the sleep length.
This will allow the program to give us a useful error, which in your case is:
CMD: Kill command called
Unhandled exception in thread started by <function kill at 0x0223CF30>
Traceback (most recent call last):
File "killnotepad.py", line 4, in kill
c = wmi.WMI ()
File "C:\Python27\lib\site-packages\wmi.py", line 1293, in connect
raise x_wmi_uninitialised_thread ("WMI returned a syntax error: you're probably running inside a thread without first calling pythoncom.CoInitialize[Ex]")
wmi.x_wmi_uninitialised_thread: <x_wmi: WMI returned a syntax error: you're probably running inside a thread without first calling pythoncom.CoInitialize[Ex] (no underlying exception)>
As you can see, this reveals the real problem and leads us to the solution posted by MikeHunter.
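Putting both fixes together, a hedged sketch of what the working script might look like (using threading instead of thread, as suggested above; the join replaces the sleep):
import threading
import wmi
import pythoncom

def kill():
    # Each thread that touches COM must initialise it first.
    pythoncom.CoInitialize()
    try:
        c = wmi.WMI()
        for process in c.Win32_Process():
            if process.Name == 'notepad.exe':
                process.Terminate()
    finally:
        pythoncom.CoUninitialize()

t = threading.Thread(target=kill)
t.start()
t.join()  # keep the main thread alive until the worker finishes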

Installing signal handler with Python

(there is a follow-up to this question here)
I am working on writing a Python-based init system for Linux, but I'm having an issue getting signals delivered to my Python init script. From the 'man 2 kill' page:
The only signals that can be sent to process ID 1, the init process, are those for which init has explicitly installed signal handlers.
In my Python based Init, I have a test function and a signal handler setup to call that function:
def SigTest(SIG, FRM):
    print "Caught SIGHUP!"

signal.signal(signal.SIGHUP, SigTest)
From another TTY (the init script executes sh on another TTY), if I send the signal with kill -HUP 1, it is completely ignored and the text is never printed.
I found this issue because I wrote a reaping function for my Python init to reap its child processes as they die, but they all just became zombies; it took a while to figure out that Python was never getting the SIGCHLD signal. Just to ensure my environment is sane, I wrote a C program that forks and has the child send PID 1 a signal, and that signal did register.
How do I install a signal handler the system will acknowledge if signal.signal(SIG, FUNC) isn't working?
I'm going to try using ctypes to register my handler from C code and see if that works, but I'd rather have a pure Python answer if at all possible.
Ideas?
(I'm not a programmer, I'm really in over my head here :p)
Test code below...
import os
import sys
import time
import signal

def SigTest(SIG, FRM):
    print "SIGINT Caught"

print "forking for ash"
cpid = os.fork()
if cpid == 0:
    os.closerange(0, 4)
    sys.stdin = open('/dev/tty2', 'r')
    sys.stdout = open('/dev/tty2', 'w')
    sys.stderr = open('/dev/tty2', 'w')
    os.execv('/bin/ash', ('ash',))
print "ash started on tty2"

signal.signal(signal.SIGHUP, SigTest)

while True:
    time.sleep(5.0)
Signal handlers mostly work in Python, but there are some problems. One is that your handler won't run until the interpreter re-enters its bytecode loop: if your program is blocked in a C function, the signal handler is not called until that function returns. You don't show the code where you are waiting. Are you using signal.pause()?
Another is that if you are in a system call, you will get an exception after the signal handler returns. You need to wrap all system calls with a retry handler (at least on Linux); a sketch follows.
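A minimal sketch of such a retry wrapper, assuming the usual EINTR behaviour on Linux (this is my illustration, not code from the question):
import errno
import os  # used in the commented example below

def retry_on_eintr(call, *args):
    # Retry a blocking system call that a signal interrupted (errno EINTR).
    while True:
        try:
            return call(*args)
        except (OSError, IOError) as e:
            if e.errno != errno.EINTR:
                raise

# Example: a read that survives a signal arriving mid-call.
# data = retry_on_eintr(os.read, fd, 1024)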
It's interesting that you are writing an init replacement... That's something like a process manager. The proctools code might interest you, since it does handle SIGCHLD.
By the way, this code:
import signal

def SigTest(SIG, FRM):
    print "SIGINT Caught"

signal.signal(signal.SIGHUP, SigTest)

while True:
    signal.pause()
Does work on my system.
