Graceful Handling of Segfault - python

I'm writing a program in Python that uses a closed source API in Linux. The API sometimes works and sometimes segfaults, crashing my program along with it. However, if the program runs for 10 seconds, it's past the point where it has a chance of segfaulting and will run forever (the errors only happen in the beginning).
I think I need some type of script that:
starts my python program,
waits 10 seconds,
checks if python is still running
if it is running, the script should end itself without ending python
if python is NOT running, then repeat.
Is such a program possible? Will a segfault kill the script also?

Yes, such a program is perfectly possible. You just have to run these two programs in separate processes - a SEGFAULT only kills the process in which it occurred.
If you are under Linux, you can use either bash or Python. Just start the script that is failing in a separate process. Code in Python could look similar to this:
import subprocess
import time

threshold = 10.0  # seconds: past this point the program no longer segfaults

start = time.monotonic()  # wall-clock timer; time.clock() measured CPU time and is gone in 3.8
ret = subprocess.call(['myprog', 'myarg0', ...])  # blocks until the program exits
end = time.monotonic()

if end - start < threshold:
    restart()  # it died early, so it most likely segfaulted - start it over
Also, the return code of such a process is meaningful when it was killed by a SEGFAULT: on Linux, subprocess.call() returns the negative signal number, so a segfault shows up as -11 (that is, -signal.SIGSEGV).
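Putting those pieces together, here is a minimal watchdog sketch (myprog and myarg0 are hypothetical placeholders) that follows the steps from the question: start the program, wait 10 seconds, exit if it is still alive, and restart it if it has already died:
import subprocess
import time

THRESHOLD = 10.0  # seconds after which the program is considered stable

while True:
    proc = subprocess.Popen(['myprog', 'myarg0'])  # start the program
    time.sleep(THRESHOLD)
    if proc.poll() is None:
        # still running after 10 seconds, so it is past the danger zone;
        # the watchdog can exit without touching the child
        break
    # it exited already (e.g. returncode -11 for SIGSEGV), so start it again
    print('exited with', proc.returncode, '- restarting')
The child keeps running after the watchdog exits, so the last step of the question ("end itself without ending python") happens automatically.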

Can you isolate the calls to this buggy API inside a child process? That way you can check the exit status and handle crashes within a try ... except.
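As a rough sketch of that idea (call_buggy_api() is a hypothetical wrapper, not part of the original answer), the risky calls can run in a multiprocessing child; a segfault in the child shows up as a negative exitcode instead of killing the parent:
import multiprocessing

def call_buggy_api():
    ...  # hypothetical wrapper around the closed-source API calls

def run_isolated():
    proc = multiprocessing.Process(target=call_buggy_api)
    proc.start()
    proc.join()
    if proc.exitcode != 0:
        # a segfault is reported as -SIGSEGV (-11); raise so the caller can retry
        raise RuntimeError('API call crashed with exit code %s' % proc.exitcode)

try:
    run_isolated()
except RuntimeError:
    print('API crashed, retrying...')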

Related

How to Raise KeyboardInterrupt to Python Scripts After Interrupting GNU Parallel?

I'm using GNU Parallel to run a Python script for a list of different arguments. Inside the Python script, I'm writing data to a file (in fact, the name of the file is the script argument). The Python script writes the data to the file after processing N trials, where N is another argument. Consequently, the data does not get written until all trials are finished. But the time to go through a trial can vary depending on a number of test arguments. For this reason, should the script take too long for a certain set of arguments, the script allows me to raise a KeyboardInterrupt error (Ctrl+C) and write the data it has obtained so far before terminating.
However, when using GNU Parallel, pressing Ctrl+C kills the parallel command and completely stops the Python jobs, so the data obtained so far never gets written.
Is it possible to raise KeyboardInterrupt in these Python scripts to have them finish handling the error before parallel is killed? Ideally, it would go something like this:
1. Execute parallel python script.py ::: args.
2. After an amount of time, cancel using Ctrl+C.
3. Parallel tells the Python scripts to see a KeyboardInterrupt (or any error, it doesn't matter) and pauses to wait for the Python jobs to finish handling it.
4. Parallel terminates.
5. I have files with the data obtained in that time.
Note: I would like an answer that doesn't ask to rewrite the Python script's data writing method.
I believe you are looking for --termseq. Consider this test program, myprog.pl:
#!/usr/bin/perl
$SIG{'TERM'} = sub { print "TERM received. Flush files.\n"; sleep(1); };
sleep(100);
Now run:
parallel --termseq TERM,2000,KILL,20 -u ./myprog.pl ::: 1 2 3
When GNU Parallel receives Ctrl-C it will send SIGTERM to the child, wait 2000 ms, and kill the child if it is still alive.
Wait a few seconds and press Ctrl-C.
If you are absolutely sure the Python program will exit after receiving the SIGTERM then you can remove ,KILL,20. It is just a fall back if the Python program is stuck for some reason.
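The question's scripts already know how to save partial data on KeyboardInterrupt, so one way to reuse that path (a sketch, not part of the original answer; run_trials() and write_partial_results() stand in for the existing code) is to install a SIGTERM handler in each Python script that simply raises KeyboardInterrupt. GNU Parallel's --termseq then gives the existing except block time to flush the data:
import signal
import sys

def on_sigterm(signum, frame):
    # Reuse the existing Ctrl+C handling path when parallel sends SIGTERM
    raise KeyboardInterrupt

signal.signal(signal.SIGTERM, on_sigterm)

try:
    run_trials()  # hypothetical name for the existing per-trial loop
except KeyboardInterrupt:
    write_partial_results()  # hypothetical name for the existing save routine
    sys.exit(0)
The scripts would then be launched the same way as above, e.g. parallel --termseq TERM,2000,KILL,20 python script.py ::: args.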

GDB not stopping with "interrupt" command from python script

I've been ripping my hair out over this. I've searched the internet and can't seem to find a solution to my problem. I'm trying to auto-test some code using the gdb module from Python. I can run basic commands and things are working, except for stopping a process that's running in the background. Currently I continue my program in the background after a breakpoint with this:
gdb.execute("c&")
I then interact with the running program reading different constant values and getting responses from the program.
Next I need to get a chunk of memory so I run these commands:
gdb.execute("interrupt") #Pause execution
gdb.execute("dump binary memory montiormem.bin 0x0 (&__etext + 4)") #dump memory to file
But when I run the memory dump I get an error saying the command can't be run while the target is running; after the error the interrupt command is run and the target is paused, and then from the gdb console window I can run the memory dump.
I found a similar issue from a while ago that seems to not have been answered here.
I'm using python2.7.
I also found this link, which seems to describe the issue, but there is no indication of whether it affects my build of gdb (which seems unlikely).
I had the same problem. From what I can tell from googling, it is a current limitation of gdb: interrupt simply doesn't work in batch mode (when specifying commands with --ex, or -x file, or on stdin, or sourcing from a file); it runs the following commands before actually stopping the execution (inserting a delay doesn't help). Building on @dwjbosman's solution, here's a compact version suitable for feeding to gdb with --ex arguments, for example:
python import threading, gdb
python threading.Timer(1.0, lambda: gdb.post_event(lambda: gdb.execute("interrupt"))).start()
cont
thread apply all bt full # or whatever you wanted to do
It schedules an interrupt after 1 second and resumes the program, then you can do whatever you wanted to do after the pause right in the main script.
I had the same problem, but found that none of the other answers here really work if you are trying to script everything from python. The issue that I ran into was that when I called gdb.execute('continue'), no code in any other python thread would execute. This appears to be because gdb does not release the python GIL while the continue command is waiting for the program to be interrupted.
What I found that actually worked for me was this:
import time
import gdb

def delayed_interrupt():
    time.sleep(1)
    gdb.execute('interrupt')

gdb.post_event(delayed_interrupt)
gdb.execute('continue')
I just ran into this same issue while writing some automated testing scripts. What I've noticed is that the 'interrupt' command doesn't stop the application until after the current script has exited.
Unfortunately, this means that you would need to segment your scripts anytime you are causing an interrupt.
Script 1:
gdb.execute('c&')
gdb.execute('interrupt')
Script 2:
gdb.execute("dump binary memory montiormem.bin 0x0 (&__etext + 4)")
I used multithreading to get around this issue:
import threading
import time
import gdb

def post(cmd):
    def _callable():
        print("exec " + cmd, flush=True)
        gdb.execute(cmd)
    print("schedule " + cmd, flush=True)
    gdb.post_event(_callable)

class ScriptThread(threading.Thread):
    def run(self):
        while True:
            post("echo hello\n")
            time.sleep(1)

x = ScriptThread()
x.start()
Save this as "test_script.py"
Use the script as follows:
gdb
> source test_script.py
Note that you can also pipe "source test_script.py", but you need to keep the pipe open.
Once the thread is started GDB will wait for the thread to end and will process any commands you send to it via the "post_event" function. Even "interrupt"!

How to run a python process in the background continuously

I'm trying to build a todo manager in python where I want to continuously run a process in the bg that will alert the user with a popup when the specified time comes. I'm wondering how I can achieve that.
I've looked at some of the answers on StackOverflow and on other sites but none of them really helped.
So, what I want to achieve is to start a background process once the user enters a task and keep it running in the background until its time comes. At the same time there might be other threads running for other tasks as well, each of which will end at its own end time.
So far, I've tried this:
t = Thread(target=bg_runner, kwargs={'task': task, 'lock_file': lock_file_path})
t.setName("Get Done " + task.
t.start()
t.join()
With this, the thread runs continuously, but it runs in the foreground and the script only exits when the execution is done.
If I add t.daemon = True in the above code, the main thread exits immediately after start(), and it looks like the daemon thread is killed along with it.
Please let me know how this can be solved.
I'm guessing that you just don't want to see the terminal window after you launch the script. In this case, it is a matter of how you execute the script.
Try these things.
If you are using a Windows computer you can try using pythonw.exe:
pythonw.exe example_script.py
If you are using Linux (or maybe OS X) you may want to use nohup in the terminal:
nohup python example_script.py
More or less, the reason you have to do this comes down to how the operating system handles processes. I am not an expert on this subject matter, but generally, if you launch a script from a terminal, that script becomes a child process of the terminal. So if you exit that terminal, it will also terminate any child processes. The only way to get around that is to detach the process from the terminal with something like nohup.
Now if you end up adding the #!/usr/bin/env python shebang line, your os could possibly just run the script without a terminal window if you just double click the script. YMMV (Again depends on how your OS works)
The first thing you need to do is prevent your script from exiting by adding a while loop in the main thread:
import time
from threading import Thread
t = Thread(target=bg_runner, kwargs={'task': task, 'lock_file': lock_file_path})
t.setName("Get Done " + task)
t.start()
t.join()
while True:
    time.sleep(1.0)
Then you need to put it in the background:
$ nohup python alert_popup.py >> /dev/null 2>&1 &
You can get more information on controlling a background process at this answer.

How can I start and stop a Python script from shell

thanks for helping!
I want to start and stop a Python script from a shell script. The start works fine, but I want to stop/terminate the Python script after 10 seconds (it's a counter that keeps counting), but it won't stop... I think it is hanging on the first line.
What is the right way to start it, wait 10 seconds, and stop it?
Shell script:
python /home/pi/count1.py
sleep 10
kill /home/pi/count1.py
It's not working yet. I get the point of running the script in the background, and that part works! But I get another error from my Raspberry Pi after running:
python /home/pi/count1.py &
sleep 10; kill /home/pi/count1.py
/home/pi/sebastiaan.sh: line 19: kill: /home/pi/count1.py: arguments must be process or job IDs
The problem has to be in this line (but what is wrong? Thanks for helping out!):
sleep 10; kill /home/pi/count1.py
You're right, the shell script "hangs" on the first line until the python script finishes; if the python script never finishes, the shell script never continues. Therefore you have to use & at the end of the command to run it in the background. This way, the python script starts and the shell script continues.
The kill command doesn't take a path, it takes a process id. After all, you might run the same program several times, and then try to kill the first, or last one.
The bash shell supports the $! variable, which is the pid of the last background process.
Your current example script is wrong, because it doesn't run the python job and the sleep job in parallel. Without adornment, the script will wait for the python job to finish, then sleep 10 seconds, then kill.
What you probably want is something like:
python myscript.py & # <-- Note '&' to run in background
LASTPID=$! # Save $! in case you do other background-y stuff
sleep 10; kill $LASTPID # Sleep then kill to set timeout.
You can terminate any process from any other process if the OS lets you do it, i.e. if it isn't some critical process belonging to the OS itself.
The kill command uses a PID to kill the process, not the process's name or command.
Use pkill if you want to kill by name.
You can also send it a different signal instead of SIGTERM (the request to terminate a program), one that you may wish to detect inside your Python application and respond to.
For instance you may wish to check if the process is alive and get some data from it.
To do this, choose one of the user-defined signals (SIGUSR1 or SIGUSR2) and register a handler for it within your Python program using the signal module, as sketched below.
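A minimal sketch of that idea (the handler body and the report_status() name are hypothetical, not from the answer):
import os
import signal

def report_status(signum, frame):
    # Hypothetical handler: prove the process is alive and report some state
    print('count1.py is alive, pid %d' % os.getpid())

# SIGUSR1 is one of the two signals reserved for application-defined use
signal.signal(signal.SIGUSR1, report_status)
From the shell you could then poke the running script with kill -USR1 $LASTPID without terminating it.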
To see why your script hangs, see Austin's answer.

Constantly monitor a program/process using Python

I am trying to constantly monitor a process which is basically a Python program. If the program stops, then I have to start the program again. I am using another Python program to do so.
For example, say I have to constantly run a process called run_constantly.py. I initially run this program manually, which writes its process ID to the file "PID" (in the location out/PROCESSID/PID).
Now I run another program which has the following code to monitor the program run_constantly.py from a Linux environment:
def Monitor_Periodic_Process():
    TIMER_RUNIN = 1800
    foo = imp.load_source("Run_Module", "run_constantly.py")
    PROGRAM_TO_MONITOR = ['run_constantly.py', 'out/PROCESSID/PID']
    while(1):
        # call the function checkPID to see if the program is running or not
        res = checkPID(PROGRAM_TO_MONITOR)
        # if res is 0 then the program is not running, so schedule it
        if (res == 0):
            date_time = datetime.now()
            scheduler.add_cron_job(foo.Run_Module, year=date_time.year, day=date_time.day, month=date_time.month, hour=date_time.hour, minute=date_time.minute+2)
            scheduler.start()
            scheduler.get_jobs()
            time.sleep(TIMER_NOT_RUNIN)
            continue
        else:
            # the process is running; sleep and then monitor again
            time.sleep(TIMER_RUNIN)
            continue
I have not included the checkPID() function here. checkPID() basically checks if the process ID still exists (i.e. if the program is still running) and returns 0 if it does not. In the above program, I check whether res == 0, and if so, I use Python's scheduler to schedule the program. However, the major problem I am currently facing is that the process ID of this monitoring program and of run_constantly.py turn out to be the same once I schedule run_constantly.py using scheduler.add_cron_job(). So if run_constantly.py crashes, the monitoring program still thinks run_constantly.py is running (since both process IDs are the same), and it therefore keeps going into the else branch to sleep and monitor again.
Can someone tell me how to solve this issue? Is there a simple way to constantly monitor a program and reschedule it when it has crashed?
There are many programs that can do this.
On Ubuntu there is upstart (installed by default)
Lots of people like http://supervisord.org/
monit, as mentioned by @nathan
If you are looking for a python alternative there is a library that has just been released called circus which looks interesting.
And pretty much every Linux distro probably has one of these built in.
The choice is really just down to which one you like better, but you would be far better off using one of these than writing it yourself.
Hope that helps
If you are willing to control the monitored program directly from Python instead of using cron, have a look at the subprocess module:
The subprocess module allows you to spawn new processes,
connect to their input/output/error pipes, and obtain their return codes.
See questions like "track process status with python" on SO for examples and references.
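As a rough illustration of that approach (a sketch only; the 30-second check interval is an arbitrary choice, not from the answer), the monitor itself can own the child process, so the PID bookkeeping from the question goes away entirely:
import subprocess
import time

CHECK_INTERVAL = 30  # seconds between liveness checks

proc = subprocess.Popen(['python', 'run_constantly.py'])
while True:
    time.sleep(CHECK_INTERVAL)
    if proc.poll() is not None:  # the child has exited (crashed or finished)
        print('run_constantly.py exited with code', proc.returncode, '- restarting')
        proc = subprocess.Popen(['python', 'run_constantly.py'])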
You could just use monit
http://mmonit.com/monit/
It monitors processes and restarts them (and other things.)
I thought I'd add a more versatile solution, which is one that I personally use all the time as well.
Its name is Immortal (the source is at https://github.com/immortal/immortal).
To have it monitor and instantly restart a program if it stops, simply run the following command:
immortal <command>
So in your case I would run run_constantly.py like so:
immortal python run_constantly.py
The command ps aux | grep run_constantly.py should return 2 process IDs: one for the Immortal process itself, and one for the command Immortal started (just the regular command). As long as the Immortal process is running, run_constantly.py will stay running.
