Find error in Python script running inside Screen

I have written a Python script that runs in an infinite loop. The script is started inside a screen session. However, sometimes after a few hours or even days it breaks down for a reason I don't know, because the screen session closes when that happens.
I have also created a "watchdog" script with the following code, which also runs inside a screen session:
from subprocess import check_output
import os
import time
import random
time.sleep(20)
def screen_present(name):
    try:
        var = check_output(["screen -ls; true"], shell=True)
        if "." + name + "\t(" in var:
            print name + " is running"
        else:
            print name + " is not running"
            print "RESTARTING"
            os.system("screen -dmS player python /var/www/updater.py > /dev/null 2> /dev/null & echo $")
    except:
        return True

while True:
    screen_present("updater")
    time.sleep(random.uniform(6, 10))
So when I check my scripts after leaving them running for a night or so, I sometimes find that:
the screen session with the original code is not there anymore, because my script must have thrown an exception, but I can't find out which one in order to fix it
the screen session of my watchdog is marked as "dead"
What would you guys do to find the error and guarantee stable running?

When you start your Python process, make it output to a file, like this:
python myfile.py >> log.txt 2>&1
You will be able to access that file even after the process dies.

Related

How to restart a Python script?

In a program I am writing in Python, I need to completely restart the program if a variable becomes true. Looking around for a while, I found this command:
while True:
    if reboot == True:
        os.execv(sys.argv[0], sys.argv)
When executed it returns the error [Errno 8] Exec format error. I searched for further documentation on os.execv but didn't find anything relevant, so my question is whether anyone knows what I did wrong, or knows a better way to restart a script (by restarting I mean completely re-running the script, as if it were being opened for the first time, with all variables unassigned and no threads running).
There are multiple ways to achieve the same thing. Start by modifying the program to exit whenever the flag turns True. Then there are various options, each one with its advantages and disadvantages.
Wrap it using a bash script.
The script should handle exits and restart your program. A really basic version could be:
#!/bin/bash
while :
do
    python program.py
    sleep 1
done
Start the program as a sub-process of another program.
Start by wrapping your program's code in a function. Then your __main__ could look like this:
from multiprocessing import Process

def program():
    ### Here is the code of your program
    ...

while True:
    process = Process(target=program)
    process.start()
    process.join()
    print("Restarting...")
This code is relatively basic, and it requires error handling to be implemented.
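One rough way to add that error handling is to inspect the child's exitcode and give up after repeated consecutive crashes. The helper name and the retry limit below are illustrative choices, not part of the original answer:

```python
from multiprocessing import Process

def run_with_restarts(target, max_failures=5):
    """Restart `target` whenever it exits. A clean exit (exitcode 0)
    resets the failure counter; after `max_failures` consecutive
    crashes, give up and return the last exitcode."""
    failures = 0
    p = None
    while failures < max_failures:
        p = Process(target=target)
        p.start()
        p.join()
        if p.exitcode == 0:
            failures = 0   # clean exit: restart with a fresh counter
        else:
            failures += 1  # crashed or was killed
            print("Restarting (failure %d/%d)..." % (failures, max_failures))
    return p.exitcode
```

Note that a target which always exits cleanly will be restarted forever, which is the behavior the original loop intends.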
Use a process manager
There are a lot of tools available that can monitor the process, run multiple processes in parallel and automatically restart stopped processes. It's worth having a look at PM2 or similar.
IMHO the third option (a process manager) looks like the safest approach. The other approaches have edge cases and require implementation on your side to handle them.
This has worked for me. Add the shebang at the top of your code and use os.execv() as shown below.
#!/usr/bin/env python3
import os
import sys

if __name__ == '__main__':
    while True:
        reboot = input('Enter:')
        if reboot == '1':
            sys.stdout.flush()
            os.execv(sys.executable, [sys.executable, __file__] + [sys.argv[0]])
        else:
            print('OLD')
I got the same "Exec format error", and I believe it is basically the same error you get when you simply type a Python script name at the command prompt and expect it to execute. On Linux that won't work because a path is required, and the execv method is encountering the same error.
You could add the path of your Python interpreter, and that error goes away, except that the name of your script then becomes a parameter and must be added to the argv list. To avoid that, make your script independently executable by adding "#!/usr/bin/python3" to the top of the script AND chmod 755.
This works for me:
#!/usr/bin/python3
# this script is called foo.py
import os
import sys
import time
if len(sys.argv) >= 2:
    Arg1 = int(sys.argv[1])
else:
    sys.argv.append(None)
    Arg1 = 1

print(f"Arg1: {Arg1}")
sys.argv[1] = str(Arg1 + 1)
time.sleep(3)
os.execv("./foo.py", sys.argv)
Output:
Arg1: 1
Arg1: 2
Arg1: 3
.
.
.

Terminate a python program running from another program

How to use subprocess to terminate a program which is started at boot?
I ran across this question, found wordsforthewise's answer and tried it, but nothing happens.
wordsforthewise's answer:
import subprocess as sp
extProc = sp.Popen(['python','myPyScript.py']) # runs myPyScript.py
status = sp.Popen.poll(extProc) # status should be 'None'
sp.Popen.terminate(extProc) # closes the process
status = sp.Popen.poll(extProc) # status should now be something other than 'None' ('1' in my testing)
I have a program, /home/pi/Desktop/startUpPrograms/facedetection.py, running at boot via a cronjob, and I want to kill it from a Flask app route like this.
Would assigning the program name to extProc, as in extProc = program_name, work? If yes, how do I assign it?
@app.route("/killFD", methods=['GET', 'POST'])
def killFaceDetector():
    # kill code goes here.
Since you say the program is run by cronjob, you will have no handle to the program's PID in Python.
You'll have to iterate over all processes to find the one(s) to kill... or more succinctly, just use the pkill utility, with the -f flag to have it look at the full command line. The following will kill all processes (if your user has the permission to do so) that have facedetection.py in the command line.
import os
os.system('pkill -f facedetection.py')
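If you need the matched PIDs back in Python (for example, to report them from the Flask route), here is a rough stdlib-only sketch built on the pgrep utility. It is Unix-only, and the helper name is mine, not from the answer above:

```python
import os
import signal
import subprocess

def kill_by_pattern(pattern):
    """SIGTERM every process whose full command line matches `pattern`
    (like `pkill -f`), and return the list of PIDs that matched.
    Unix-only; relies on the pgrep utility being installed."""
    try:
        out = subprocess.check_output(['pgrep', '-f', pattern])
    except subprocess.CalledProcessError:
        return []  # pgrep exits non-zero when nothing matches
    pids = [int(line) for line in out.split()]
    for pid in pids:
        if pid != os.getpid():  # our own command line might match too
            os.kill(pid, signal.SIGTERM)
    return pids
```

In the route handler you could then return kill_by_pattern('facedetection.py') as the response body.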

Is there any library on Python 3 to relaunch the script?

I have some script in Python, which does some work. I want to re-run this script automatically. Also, I want to relaunch it on any crashes/freezes.
I can do something like this:
while True:
    try:
        main()
    except Exception:
        os.execv(sys.executable, ['python'] + sys.argv)
But, for some unknown reason, this still crashes or freezes once every few days. When I see the crash, I type "python main.py" in cmd and it starts again, so I don't know why os.execv doesn't do this work by itself. I guess it's because this code is part of the app itself. So I would prefer some script/app that handles the relaunch externally; I hope that will be more stable.
So this script should work in this way:
Start any script
Check that the process of this script is working; for example, watch some file's modification time and identify the process by name|ID|etc.
When it disappears from the process list, launch it again
When the file was changed more than 5 minutes ago, stop the process, wait a few seconds, launch it again
In general: be cross-platform (Linux/Windows)
Not important: log all crashes
I can do this myself (right now I'm working on it), but I'm pretty sure something like this must already have been done by somebody; I just can't find it on Google/GitHub.
UPDATE: added code from the #hansaplast answer to GitHub. Also added some changes to it: relauncher. Feel free to copy/use it.
As it needs to work both on Windows and on Linux, I don't know a way to do that with standard tools, so here's a DIY solution:
from subprocess import Popen
import os
import time

# change into the script's directory
abspath = os.path.abspath(__file__)
dname = os.path.dirname(abspath)
os.chdir(dname)

while True:
    p = Popen(['python', 'my_script.py', 'arg1', 'arg2'])
    time.sleep(20)  # give the program some time to write into the logfile
    while True:
        if p.poll() is not None:
            print('crashed or regularly terminated')
            break
        file_age_in_s = time.time() - os.path.getmtime('output.log')
        if file_age_in_s > 60:
            print('frozen, killing process')
            p.kill()
            break
        time.sleep(1)
    print('restarting..')
Explanation:
time.sleep(20): give the script 20 seconds to write into the log file
poll(): regularly check whether the script died (either crashed or terminated normally; you can check the return value of poll() to differentiate)
getmtime(): regularly check output.log and see whether it was changed within the past 60 seconds
time.sleep(1): wait 1 s between checks, as otherwise it would eat up too many system resources
The script assumes that the check-script and the run-script are in the same directory. If that is not the case, change the lines beneath "change into scripts directory"
I personally like the supervisor daemon, but it has two issues here:
It is only for Unix systems
It restarts the app only on crashes, not on freezes.
But it has a simple XML-RPC API, which makes your job of writing a freeze-watchdog app simpler. You could just start your process under supervisor and restart it via the supervisor API when you see it freeze.
You can install it via apt install supervisor on Ubuntu and write a config like this:
[program:main]
user=vladimir
command=python3 /var/local/main/main.py
process_name=%(program_name)s
directory=/var/local/main
autostart=true
autorestart=true
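The freeze-watchdog idea could then talk to supervisor's XML-RPC interface. A rough sketch, assuming the [inet_http_server] section is enabled on the default port 9001 and that the logfile path matches the config above (both are assumptions about your setup):

```python
import os
import time
import xmlrpc.client

SUPERVISOR_URL = 'http://localhost:9001/RPC2'  # needs [inet_http_server] enabled

def restart_if_frozen(program='main', logfile='/var/local/main/output.log',
                      max_age_s=300):
    """If the program's logfile hasn't changed in max_age_s seconds,
    assume it is frozen and restart it through supervisor's API."""
    server = xmlrpc.client.ServerProxy(SUPERVISOR_URL)
    if time.time() - os.path.getmtime(logfile) > max_age_s:
        server.supervisor.stopProcess(program)
        server.supervisor.startProcess(program)
```

Calling restart_if_frozen() from a cron job or a small loop would give you the freeze handling that supervisor itself lacks.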

In python 2.7 using CGI, how to check whether forked process completed

So I have a set of Python scripts. In an attempt to make a simple GUI, I have been combining HTML and CGI. So far so good. However, one of my scripts takes a long time to complete (>2 hours), so when I run it on my server (localhost on Mac) I get a "gateway timeout" error. I was reading about forking a subprocess and checking whether the process completed.
This is what I came up with, but it isn't working :(.
import os, sys, time

# upstream stuff happening as part of main script
pid = os.fork()
if pid == 0:
    os.system("external_program")  # this program will make "text.txt" as output
    exit()

# check whether text.txt has been written; if not, print "processing" and continue to wait
while os.stat("text.txt").st_size == 0:
    print "processing"
    sys.stdout.flush()
    time.sleep(300)

# downstream stuff happening
As always, any help is appreciated.
Did you try this one:
import os
processing = len(os.popen('ps aux | grep yourscript.py').readlines()) > 2
It tells you if your script is still running (returns boolean value).
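A more direct, Unix-only alternative (in Python 3 syntax here) is to keep the child's PID from os.fork() and poll it non-blockingly with os.waitpid, instead of grepping ps:

```python
import os

def child_finished(pid):
    """Non-blocking check on a forked child: returns True once the
    child has exited (and reaps it), False while it is still running."""
    done_pid, _status = os.waitpid(pid, os.WNOHANG)
    return done_pid == pid
```

The parent can call child_finished(pid) inside its sleep loop. Note that once it returns True the child has been reaped, so don't call it again for the same PID.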

Executing 2 blocking processes and need the control back in the same ssh session

I'm trying to automate the startup procedure for a Linux agent machine. I need to run two executable files located in two different directories and get control back at the prompt, so that I can proceed with some other task, like doing a grep to check whether the two processes are still running. Here are two different ways I tried to do this using Python:
Code snippet 1 (by forking a child process):
import os
import pdb

def child():
    cwd = os.getcwd()
    os.chdir("THoT")
    os.chdir("TH_Node")
    print "Executing TH_Node.exe........."
    command = "mono TH_Node.exe"
    os.system(command)
    os._exit(0)

def parent():
    i = 0
    while i < 1:
        i = i + 1
        newpid = os.fork()
        if newpid == 0:
            child()
        else:
            cwd = os.getcwd()
            os.chdir("THoT")
            os.chdir("TH_Protocol")
            print "Executing TH_Protocol.exe........."
            command1 = "mono TH_Protocol.exe"
            os.system(command1)

parent()
Code snippet 2 (using multiprocessing):
import multiprocessing
import time
import sys
import os
import pdb

def TH_Protocol():
    os.chdir("THoT")
    os.chdir("TH_Protocol")
    command = "mono TH_Protocol.exe"
    os.system(command)

def TH_Node():
    os.chdir("THoT")
    os.chdir("TH_Node")
    command1 = "mono TH_Node.exe"
    os.system(command1)

if __name__ == '__main__':
    d = multiprocessing.Process(name='TH_Protocol', target=TH_Protocol)
    d.TH_Protocol = True
    n = multiprocessing.Process(name='TH_Node', target=TH_Node)
    n.TH_Protocol = False
    d.start()
    n.start()
    d.join(1)
    n.join()
The problem is that although I get both processes, TH_Protocol.exe and TH_Node.exe, to run, I have to ssh into another session to run a grep command to check whether the two processes are running. I need to get control back in the same session as the one in which I ran my Python script. I tried subprocess.Popen as well, but I face the same problem. Is there any way I can solve this issue?
If you just want to run this script in the background, and get control of your ssh session back while it's running… that has nothing to do with Python, or ssh, it's basic shell job control.
For example, assuming your shell on the remote machine is sh/bash/similar, instead of this:
remote_machine$ python script.py
… do this:
remote_machine$ python script.py &
[1] 55341
remote_machine$
Now you've got the prompt back. You can interact with the main interpreter process as %1 or PID 55341. After it finally finishes, the next prompt you get will show something like this:
[1]+ Done python
You can't directly interact with the two child processes this way. You can always grep for them if you want, or search for child processes of PID 55341… but you might find your life easier if you had the child processes do something like print('TH_Protocol on {}'.format(os.getpid())) as soon as they start up, so you don't have to do that.
