I am working with a groundwater modeling executable (HYDRUS1D) which I call with a Python script. I want to do some Monte Carlo runs but sometimes the program gets hung up and does not converge for extended periods of time.
Is there a way to give the executable a certain amount of time to run, cancel it if it goes over this time, and then start a new simulation all without interrupting the Python script? The simulation should take no more than 3-5 seconds, so I am hoping to give it a maximum of 10 seconds to finish.
I first run a function that changes some input parameters to the model, then execute Hydrus via the 'run_single_sim' function:
for value in n_variations_21:
    for value2 in n_variations_23:
        write_hydraulic_params('foo',layers,value,value2)
        run_single_sim()
Where run_single_sim() executes Hydrus via os.system:
def run_single_sim():
    os.system('./hydrus LEVEL_01.DIR')
I have tried a few solutions involving threading such as this, and this; but it seems like my script gets stuck on the os.system call and therefore cannot check to see how long the thread has been running or kill the thread after sleeping the script for some specified amount of time.
You asked "how to stop an executable called via Python ...", but I feel
this question is simply about "how to stop an executable".
What's interesting is that we have a child that might misbehave.
The parent is uninteresting, could be rust, ruby, random other language.
The timeout issue you pose is a sensible question,
and there's a stock answer for it, in the GNU coreutils package.
Instead of
os.system('./hydrus LEVEL_01.DIR')
you want
os.system('timeout 10 ./hydrus LEVEL_01.DIR')
Here is a quick demo, using a simpler command than hydrus.
$ timeout 2 sleep 1; echo $?
0
$
$ timeout 2 sleep 3; echo $?
124
As an entirely separate matter, prefer subprocess.check_output()
over the old os.system().
You quoted a pair of answer articles that deal with threading.
But you're spawning a separate child process,
with no shared memory, so threading's not relevant here.
We wish to eventually send a SIGTERM signal to an ill behaved process,
and we hope it obeys the signal by quickly dropping out.
Timing out a child that explicitly ignores such signals would
be a slightly stickier problem.
An uncatchable SIGKILL can be sent
by using the --kill-after=duration flag.
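If you would rather not depend on the coreutils timeout binary, the same limit can be applied from Python. This is only a minimal sketch, assuming Python 3.5 or later; run_with_timeout is a hypothetical helper name, and the 10-second default simply mirrors the limit in the question. When the limit is exceeded, subprocess.run() kills the child and raises TimeoutExpired.

```python
import subprocess


def run_with_timeout(cmd, timeout_s=10):
    """Run cmd (a list of arguments) and return its exit code.

    Returns None if the child had to be killed for exceeding timeout_s.
    """
    try:
        completed = subprocess.run(cmd, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # subprocess.run() has already killed and reaped the child here.
        return None
    return completed.returncode
```

In the Monte Carlo loop, run_single_sim() could then call run_with_timeout(['./hydrus', 'LEVEL_01.DIR']) and skip that parameter set whenever the result is None.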
I must ensure that the end_event() function is executed at the end of the program. I tried to implement this with Python's atexit module. However, after I converted the .py file to an .exe with PyInstaller and closed the program with a click, the handler did not run. I would appreciate a solution that always works. Have a good day.
import atexit
import signal
import pyupbit
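def end_event(*_signal_args):
    # Accepts the optional (signum, frame) arguments so that the same
    # function can be registered both with atexit and as a signal handler.
    for keys in buy_list.keys():
        order = upbit.get_order(keys)
        if "state" in order:
            if(order['state'] == 'wait'):
                upbit.cancel_order(keys)
    exit(1)
atexit.register(end_event)
signal.signal(signal.SIGINT, end_event)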
There is no way to guarantee that code will run when the process exits, because there are ways to exit which do not permit any code to run.
User may pull the power cable. No code can run because there is no power.
External events (power outage, lightning strike), same as above.
Kill -9 or equivalent, e.g. Windows' TerminateProcess
Patterns for emergency cleanup.
The best pattern is: Never be in a state which requires emergency cleanup.
There is no guarantee that your cleanup will run because you could have a kill -9 or a power outage.
Therefore you need to be able to recover in that scenario.
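One way to be recoverable is to record what still needs cleanup while the program runs, and to finish that cleanup the next time it starts. The sketch below is only an illustration under assumptions: save_pending, recover_pending and the JSON journal file are hypothetical, and the cancel callback stands in for whatever call cancels an order in the real program.

```python
import json
import os
import tempfile


def save_pending(path, order_ids):
    """Atomically persist the IDs of orders that still need cleanup."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as handle:
        json.dump(sorted(order_ids), handle)
    # os.replace is atomic, so a crash never leaves a half-written journal.
    os.replace(tmp_path, path)


def recover_pending(path, cancel):
    """On startup, cancel whatever a previous run left behind."""
    if not os.path.exists(path):
        return []
    with open(path) as handle:
        leftovers = json.load(handle)
    for order_id in leftovers:
        cancel(order_id)
    save_pending(path, [])
    return leftovers
```

The program would call save_pending() whenever the set of open orders changes and recover_pending() once at startup, so a kill -9 or power loss only delays the cleanup until the next run.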
See also: What does the POSIX standard say about thread stacks in atexit() handlers? What's the OS practice?
I'm using GNU Parallel to run a Python script for a list of different arguments. Inside the Python script, I'm writing data to a file (in fact, the name of the file is the script argument). The Python script writes the data to the file after processing N trials, where N is another argument. Consequently, the data does not get written until all trials are finished. But the time to go through a trial can vary depending on a number of test arguments. For this reason, should the script take too long for a certain set of arguments, the script allows me to raise a KeyboardInterrupt error (Ctrl+C) and write the data it has obtained so far before terminating.
However, by using GNU Parallel, using Ctrl+C will kill the parallel command, and completely stop the Python jobs, hence no data-so-far being written.
Is it possible to raise KeyboardInterrupt in these Python scripts to have them finish handling the error before parallel is killed? Ideally, it would go something like 1. Execute parallel python script.py ::: args, 2. After an amount of time, cancel using Ctrl+C, 3. Parallel tells Python scripts to see a KeyboardInterrupt (or any error, it doesn't matter) and Parallel pauses to wait for Python jobs to finish handling, 4. Parallel terminates, 5. I have files with the data obtained in that time.
Note: I would like an answer that doesn't ask to rewrite the Python script's data writing method.
I believe you are looking for --termseq. Take this test program, myprog.pl:
#!/usr/bin/perl
$SIG{'TERM'} = sub { print "TERM received. Flush files.\n"; sleep(1); };
sleep(100);
Now run:
parallel --termseq TERM,2000,KILL,20 -u ./myprog.pl ::: 1 2 3
Wait a few seconds and press Ctrl-C.
When GNU Parallel receives Ctrl-C, it sends SIGTERM to the child, waits 2000 ms, and, if the child is still alive, sends SIGKILL.
If you are absolutely sure the Python program will exit after receiving the SIGTERM then you can remove ,KILL,20. It is just a fall back if the Python program is stuck for some reason.
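Since the script in the question already saves its data on KeyboardInterrupt, one way to make it react to the SIGTERM sent by --termseq is to translate that signal into the same exception. This is a minimal sketch, assuming a POSIX system; install_term_as_interrupt and run_trials are hypothetical names, and run_trials only stands in for the existing trial loop.

```python
import signal


def install_term_as_interrupt():
    """Make SIGTERM raise KeyboardInterrupt, just as Ctrl+C does."""
    def handler(signum, frame):
        raise KeyboardInterrupt
    signal.signal(signal.SIGTERM, handler)


def run_trials(n_trials, do_trial):
    """Run up to n_trials and return the results gathered so far if interrupted."""
    results = []
    try:
        for i in range(n_trials):
            results.append(do_trial(i))
    except KeyboardInterrupt:
        pass  # fall through and return the partial results
    return results
```

Calling install_term_as_interrupt() once at the top of the script leaves the existing data-writing code untouched; only the signal handling is added.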
I wish to schedule a computation to occur after my current computation in Python is finished. Note that my Python interpreter is running through emacs.
For example I am currently running:
>>> for i in range(2, 5):
...     tn.TweetNetwork.create_subnetworks(i)
...
I made a simple mistake and meant to type range(1,5). This has been running for at least 4 hours and should run for another few hours. That being said I do not want to re-execute the loop with the correction and lose all that has been computed.
As I am not by the computer 24/7, how can I schedule Python to execute the function tn.TweetNetwork.create_subnetworks(1)?
I use emacs 24.3 and ubuntu 12.04 LTS, let me know if you need more information. All help is greatly appreciated!
EDIT: I like the answer posted, however I do not know how to find the PID. I am running a Python interpreter through emacs. So how would I find that out?
This was too much for the comment, but this isn't a complete reply.
To get the ID of a process started by Emacs:
M-x list-processes,
identify the process whose ID you want, then
M-: (process-id (get-process "name-of-the-process")).
But this will give you the process ID of the interpreter, not of any other process started from it.
If you then need to get all processes spawned through that process, you can do:
$ pstree PID
Where PID is the one you obtained earlier from Emacs.
I think the easiest way is to write another script that waits until your process has finished and then runs tn.TweetNetwork.create_subnetworks(1). This will work only if create_subnetworks does not depend on any global variables and writes all of its results to a database, file, etc.
# Write a script similar to this one
import os, time
print("Waiting until the old script has completed...")
# Replace SCRIPT_PID with the process ID of the running interpreter
while os.path.exists("/proc/SCRIPT_PID"):
    time.sleep(1)
print("Executing create_subnetworks...")
tn = ...
tn.TweetNetwork.create_subnetworks(1)
Connect to your computer by SSH, get process id by ps axu | grep script_name and run this new script.
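The /proc check above is Linux-specific. The same wait can be sketched with signal 0, which tests whether a process exists without actually signalling it. This assumes a POSIX system (on Windows, os.kill behaves differently), and pid_alive and wait_for_exit are hypothetical helper names.

```python
import os
import time


def pid_alive(pid):
    """Return True if a process with this PID currently exists."""
    try:
        os.kill(pid, 0)  # signal 0: existence check only, nothing is sent
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def wait_for_exit(pid, poll_s=1.0):
    """Block until the process with this PID has gone away."""
    while pid_alive(pid):
        time.sleep(poll_s)
```

After wait_for_exit(pid) returns, the follow-up call to create_subnetworks(1) can run as in the script above.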
If Tyler's comment does not help, you can evaluate the following piece of code:
(defun foo (ignored)
  (remove-hook 'comint-output-filter-functions 'foo)
  (run-with-timer 1 nil (lambda ()
                          (goto-char (point-max))
                          (insert "tn.TweetNetwork.create_subnetworks(1)")
                          (comint-send-input))))
(add-hook 'comint-output-filter-functions 'foo)
It defines a function that inserts the command into the inferior Python buffer one second after the function is invoked (the delay is there to avoid recursive loops).
It then arranges for that function to be called whenever the inferior process (Python, in your case) writes anything. In your case, that would be the ">>>" prompt, which Python writes when it is ready. If your code generates output of its own, this approach won't work.
If you are using comint in other buffers (shell, sql, ...), you would need to make the variable comint-output-filter-functions local to your Python interactive buffer (with make-variable-buffer-local).
I am calling a small MATLAB script from Python via the subprocess module, as follows:
import subprocess
cmd='(matlab -nosplash -nodesktop -r "optimizer;quit;")'
p = subprocess.Popen(cmd,stdin=None,stdout=None,shell=True)
#subprocess.Popen.wait(p)
#p.wait()
print("DONE?")
But "DONE?" is printed even before MATLAB starts! All of my code after it breaks because of this.
I have tried:
Using os.system() calls (this is where I started, but I read on SO that it is deprecated).
Using p.wait() and subprocess.Popen.wait(p). Neither works.
Using a manual pause of 3 minutes (the maximum time MATLAB takes to finish, on average). Super sloppy.
What am I missing?
Works fine for me:
import subprocess
retcode = subprocess.call(["matlab", "-nosplash", "-nodesktop", "-r", "quit;"])
print("DONE", retcode)
Split the command arguments accordingly, use only options that you actually require (no need for shell=True, for example), use the function that directly does what you are after (call), i.e., call and wait for completion.
Depending on your installation (see http://www.mathworks.com/help/matlab/ref/matlabwindows.html), Matlab may be launched in a way such that it immediately quits. To handle that, add "-wait" to your argument list.
Start MATLAB with the "-wait" flag. From the documentation:
"MATLAB is started by a separate starter program which normally launches MATLAB and then immediately quits. Using this option tells the starter program not to quit until MATLAB has terminated. This option is useful when you need to process the results from MATLAB in a script. Calling MATLAB with this option blocks the script from continuing until the results are generated."
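Combined with subprocess.call, the flag could be passed as in the following sketch. It assumes that the matlab starter is on the PATH and that the installation accepts -wait, as the quoted Windows documentation describes; build_matlab_cmd and run_matlab are hypothetical helper names.

```python
import subprocess


def build_matlab_cmd(statement, wait=True):
    """Return the argument list for running one MATLAB statement."""
    cmd = ["matlab"]
    if wait:
        cmd.append("-wait")  # keep the starter alive until MATLAB exits
    cmd += ["-nosplash", "-nodesktop", "-r", statement]
    return cmd


def run_matlab(statement):
    """Block until MATLAB has finished and return its exit code."""
    return subprocess.call(build_matlab_cmd(statement))
```

For the script in the question, the call would be run_matlab("optimizer;quit;"), and the code after it would only run once MATLAB has terminated.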
Based on your response to my comment, let me answer your question with what I did for my application, that had a similar process to yours (albeit in C#). Instead of trying to force your process to wait for MATLAB to finish up (which is obviously not working right now), just wait for that CSV file to be written to. If you're worried about possibly having duplicates, then just append the current date and time to the end of the file, and that should do the trick.
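In Python, waiting for the output file could look roughly like the sketch below. wait_for_file is a hypothetical helper, and the 180-second default only mirrors the 3-minute pause mentioned in the question. Note that a non-empty file may still be in the middle of being written, so having MATLAB write to a temporary name and rename it at the end would be a safer convention.

```python
import os
import time


def wait_for_file(path, timeout_s=180.0, poll_s=0.5):
    """Poll until path exists and is non-empty; return False on timeout."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if os.path.exists(path) and os.path.getsize(path) > 0:
            return True
        time.sleep(poll_s)
    return False
```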
I am trying to constantly monitor a process which is basically a Python program. If the program stops, then I have to start the program again. I am using another Python program to do so.
For example, say I have to constantly run a process called run_constantly.py. I initially run this program manually, which writes its process ID to the file "PID" (in the location out/PROCESSID/PID).
Now I run another program which has the following code to monitor the program run_constantly.py from a Linux environment:
def Monitor_Periodic_Process():
    TIMER_RUNIN = 1800
    foo = imp.load_source("Run_Module","run_constantly.py")
    PROGRAM_TO_MONITOR = ['run_constantly.py','out/PROCESSID/PID']
    while(1):
        # call the function checkPID to see if the program is running or not
        res = checkPID(PROGRAM_TO_MONITOR)
        # if res is 0 then program is not running so schedule it
        if (res == 0):
            date_time = datetime.now()
            scheduler.add_cron_job(foo.Run_Module, year=date_time.year, day=date_time.day, month=date_time.month, hour=date_time.hour, minute=date_time.minute+2)
            scheduler.start()
            scheduler.get_jobs()
            time.sleep(TIMER_NOT_RUNIN)
            continue
        else:
            #the process is running sleep and then monitor again
            time.sleep(TIMER_RUNIN)
            continue
I have not included the checkPID() function here. checkPID() checks whether the process ID still exists (i.e. whether the program is still running) and returns 0 if it does not. In the above program, I check whether res == 0, and if so, I use Python's scheduler to schedule the program. However, the major problem I am facing is that the process ID of this monitoring program and that of run_constantly.py turn out to be the same once I schedule run_constantly.py using the scheduler.add_cron_job() function. So if run_constantly.py crashes, the monitoring program still thinks that run_constantly.py is running (since both process IDs are the same), and therefore keeps taking the else branch to sleep and monitor again.
Can someone tell me how to solve this issue? Is there a simple way to constantly monitor a program and reschedule it when it has crashed?
There are many programs that can do this.
On Ubuntu there is upstart (installed by default)
Lots of people like http://supervisord.org/
monit, as mentioned by @nathan
If you are looking for a python alternative there is a library that has just been released called circus which looks interesting.
And pretty much every linux distro probably has one of these built in.
The choice is really just down to which one you like better, but you would be far better off using one of these than writing it yourself.
Hope that helps
If you are willing to control the monitored program directly from Python instead of using cron, have a look at the subprocess module:
The subprocess module allows you to spawn new processes,
connect to their input/output/error pipes, and obtain their return codes.
Check examples like track process status with python on SO for examples and references.
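A bare-bones supervisor built on subprocess might look like the sketch below. Because the child is started with Popen, it runs in its own process with its own PID, unlike a job run inside the monitoring program. supervise is a hypothetical name, and both the restart limit and the restart-only-on-non-zero-exit policy are assumptions made to keep the example bounded; a real monitor would more likely loop forever.

```python
import subprocess
import time


def supervise(cmd, max_restarts=3, backoff_s=1.0):
    """Run cmd, restarting it after each non-zero exit.

    Returns the list of exit codes, one per attempt.
    """
    exit_codes = []
    for attempt in range(max_restarts + 1):
        child = subprocess.Popen(cmd)   # separate process, separate PID
        exit_codes.append(child.wait())  # blocks until the child exits
        if exit_codes[-1] == 0:
            break  # clean exit: nothing to restart
        if attempt < max_restarts:
            time.sleep(backoff_s)  # brief pause before restarting
    return exit_codes
```

For the program in the question, this would be called as supervise(["python", "run_constantly.py"]).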
You could just use monit
http://mmonit.com/monit/
It monitors processes and restarts them (and other things.)
I thought I'd add a more versatile solution, which is one that I personally use all the time as well.
Its name is Immortal (source is at https://github.com/immortal/immortal)
To have it monitor and instantly restart a program if it stops, simply run the following command:
immortal <command>
So in your case I would run run_constantly.py like so:
immortal python run_constantly.py
The command ps aux | grep run_constantly.py should return 2 process IDs: one for the Immortal command, and one for the separate command that Immortal started (just the regular command). As long as the Immortal process is running, run_constantly.py will stay running.