Unpredictable behaviour with Python subprocess calls

I'm writing a Python script that performs a series of operations in a loop by making subprocess calls, like so:
os.system('./svm_learn -z p -t 2 trial-input model')
os.system('./svm_classify test-input model pred')
os.system('python read-svm-rank.py')
score = os.popen('python scorer.py -g gold-test -i out').readline()
When I make the calls individually one after the other in the shell they work fine. But within the script they always break. I've traced the source of the error and it seems that the output files are getting truncated towards the end (leading me to believe that calls are being made without previous ones being completed).
I tried with subprocess.Popen and then using the wait() method of the Popen object, but to no avail. The script still breaks.
Any ideas what's going on here?

I'd probably first rewrite a little to use the subprocess module instead of the os module.
Then I'd probably scrutinize what's going wrong by studying a system call trace:
http://stromberg.dnsalias.org/~strombrg/debugging-with-syscall-tracers.html
Hopefully there'll be an errno-style error code starting with "E" (such as ENOENT or EACCES) near the end of the trace that'll tell you what error is being encountered.
Another option would be to comment out subsets of your subprocesses (assuming the (n+1)th doesn't depend heavily on the output of the nth), to pin down which one of them is having problems. After that, you could sprinkle some extra error reporting in the offending script to see what it's doing.
But if you're not put off by C-ish syscall traces, the tracing approach might be easier.
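As a rough sketch of the first suggestion (moving from os.system to the subprocess module), assuming Python 3.5 or later: the helper names run_steps and read_first_line are hypothetical, and the child commands are stand-ins that use the current Python interpreter so the sketch runs anywhere. In the question they would be the svm_learn, svm_classify and scorer.py calls, one list item per shell word.

```python
import subprocess
import sys


def run_steps(commands):
    """Run each command to completion, in order; stop at the first failure."""
    for argv in commands:
        # run() blocks until the child exits; check=True raises
        # CalledProcessError if the exit status is non-zero.
        subprocess.run(argv, check=True)


def read_first_line(argv):
    """Return the first line of a command's stdout (like os.popen(...).readline())."""
    result = subprocess.run(
        argv, check=True, stdout=subprocess.PIPE, universal_newlines=True
    )
    lines = result.stdout.splitlines()
    return lines[0] if lines else ""


# Stand-in commands; in the question the first would be, for example,
# ["./svm_learn", "-z", "p", "-t", "2", "trial-input", "model"].
steps = [
    [sys.executable, "-c", "print('step 1 done')"],
    [sys.executable, "-c", "print('step 2 done')"],
]
run_steps(steps)
score = read_first_line([sys.executable, "-c", "print('0.75')"])
```

Because a failing step raises an exception instead of being silently ignored, a truncated output file is more likely to show up as an error at the step that produced it.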

Related

Is there any way to know the command-line options available for a separate program from Python?

I am relatively new to Python's subprocess and os modules. So far, I have been able to execute processes such as the bc and cat commands from Python, putting data into stdin and reading the result from stdout.
Now I first want to find out, from Python code, which flags a program like cat accepts (if that is possible).
Then I want to execute a particular command with some of those flags set.
I googled both things, and it seems I found a solution for the second one, but there are multiple ways to do it. So if anyone knows how to do these things in some standard kind of way, it would be much appreciated.
In the context of processes, those flags are called arguments, hence also the argument vector called argv. Their interpretation is 100% up to the program called. In other words, you have to read the manpages or other documentation for the programs you want to call.
There is one caveat though: If you don't invoke a program directly but via a shell, that shell is the actual process being started. It then also interprets wildcards. For example, if you run cat with the argument vector ['*'], it will output the content of the file named * if it exists or an error if it doesn't. If you run /bin/sh with ['-c', 'cat *'], the shell will first resolve * into all entries in the current directory and then pass these as separate arguments to cat.
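The difference described above can be seen in a small sketch. It assumes a POSIX system with /bin/sh available, and it uses a stand-in child (the current Python interpreter printing its own argument vector) instead of cat; the file names a.txt and b.txt are made up for the illustration.

```python
import os
import subprocess
import sys
import tempfile

# Child program that just prints the argument vector it received.
SHOW_ARGV = "import sys; print(sys.argv[1:])"

with tempfile.TemporaryDirectory() as workdir:
    # Two files for the shell to expand '*' into.
    for name in ("a.txt", "b.txt"):
        with open(os.path.join(workdir, name), "w") as handle:
            handle.write("x\n")

    # Direct invocation: no shell is involved, so the child sees a literal '*'.
    direct = subprocess.run(
        [sys.executable, "-c", SHOW_ARGV, "*"],
        cwd=workdir, stdout=subprocess.PIPE, universal_newlines=True, check=True,
    ).stdout.strip()

    # Via the shell: /bin/sh expands '*' into file names before echo runs.
    via_shell = subprocess.run(
        ["/bin/sh", "-c", "echo *"],
        cwd=workdir, stdout=subprocess.PIPE, universal_newlines=True, check=True,
    ).stdout.strip()

print(direct)
print(via_shell)
```

The first call reports the single argument '*' unchanged, while the second lists the files in the directory, because the expansion happened in the shell and not in the program being called.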

Debug a Python program which seems paused for no reason

I am writing a Python program to analyze log files. So basically I have about 30000 medium-size log files and my Python script is designed to perform some simple (line-by-line) analysis of each log file. Roughly it takes less than 5 seconds to process one file.
So once I set up the processing, I just left it running, and when I came back after about 14 hours, my Python script had simply paused right after analyzing one log file. It seems that it had not written the analysis output for this file to the file system, and that's it. It did not proceed any further.
I checked the memory usage and it seems fine (less than 1G). I also tried writing to the file system (touch test), and that works as normal. So my question is: how should I proceed to debug the issue? Could anyone share some thoughts on that? I hope this is not too general. Thanks.
You may use the trace module (Trace or track Python statement execution) and/or the pdb module (The Python Debugger).
Try this tool https://github.com/khamidou/lptrace with command:
sudo python lptrace -p <process_id>
It will print every Python function your program invokes and may help you understand where your program is stuck or whether it is in an infinite loop.
If it does not output anything, your program is probably stuck, so try
pstack <process_id>
to check the stack trace and find out where it is stuck. The output of pstack consists of C frames, but I believe you can still find something useful in it to solve your problem.
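Neither lptrace nor pstack is shown here. As an alternative sketch that needs only the standard library, the faulthandler module can dump the Python-level stack of every thread, either on demand or automatically after a timeout. This has to be set up before the long run starts, so it helps with the next run and not with a process that is already hung. The analyze function and the one-hour timeout are hypothetical placeholders.

```python
import faulthandler
import tempfile

# faulthandler writes through a real file descriptor, so use an actual file.
trace_log = tempfile.NamedTemporaryFile(mode="w", suffix=".log", delete=False)

# Watchdog: if the program is still running after `timeout` seconds, dump the
# stack of every thread to trace_log, and repeat every `timeout` seconds.
faulthandler.dump_traceback_later(timeout=3600, repeat=True, file=trace_log)


def analyze(path):
    """Hypothetical stand-in for the per-file log analysis."""
    # An on-demand dump, to show the kind of output the watchdog would write.
    faulthandler.dump_traceback(file=trace_log)


analyze("example.log")

# Stop the watchdog before closing the file it writes to.
faulthandler.cancel_dump_traceback_later()
trace_log.close()

with open(trace_log.name) as handle:
    report = handle.read()
print(report)
```

On Unix, faulthandler.register(signal.SIGUSR1) is a related option: it makes the process dump its stacks whenever it receives that signal, for example from kill -USR1 <process_id>.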

Control an executed program with Python

I want to abort and restart a test run via bash if the test takes too long. So far, I have found some good solutions here. But since the kill command does not work properly (even when I use it correctly, it reports that it is not used correctly), I decided to solve this problem using Python. This is the call I want to monitor:
EXE="C:/program.exe"
FILE="file.tpt"
HOME_DIR="C:/Home"
"$EXE" -vm-Xmx4096M --run build "$HOME_DIR/test/$FILE" "Auslieferung (ML) Execute"
(The opened *.exe starts a test run that includes some Simulink simulation runs. Sometimes there are Simulink errors; in that case the tests take too long and I want to restart the entire process.)
First, I came up with the idea of calling a shell script containing these lines in a subprocess from Python:
import subprocess
import time
process = subprocess.Popen('subprocess.sh', shell = True)
time.sleep(10)
process.terminate()
But when I use this, neither *.terminate() nor *.kill() closes the program I started with the subprocess call.
That's why I am now trying to implement the entire call in Python. I have the following so far:
import subprocess
file = "somePath/file.tpt"
p = subprocess.Popen(["C:/program.exe", file])
Now I need to know how to pass the last argument of the bash call, "Auslieferung (ML) Execute". This argument starts an internal test run named "Auslieferung (ML) Execute". Any ideas? Or is it better to choose one of the other approaches? Or can I get a working "kill" for bash somewhere, somehow?
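One possible direction, as a minimal sketch assuming Python 3.3 or later: pass every word of the bash line as its own list item and use wait(timeout=...) to enforce the time limit. The helper name run_with_timeout is hypothetical, and the stand-in commands use the current Python interpreter so the sketch runs anywhere. With shell=True the Popen object refers to the shell, not to the program the shell started, which is a likely reason terminate() appeared to have no effect.

```python
import subprocess
import sys


def run_with_timeout(argv, timeout_seconds):
    """Start argv without a shell; kill it if it exceeds timeout_seconds.

    Returns the exit code, or None if the process had to be killed.
    """
    process = subprocess.Popen(argv)
    try:
        return process.wait(timeout=timeout_seconds)
    except subprocess.TimeoutExpired:
        process.kill()  # targets the program itself, since no shell sits in between
        process.wait()  # reap the killed process
        return None


# In the question, argv would presumably look like this (an assumption based
# on the bash line; each shell word becomes one list item):
# ["C:/program.exe", "-vm-Xmx4096M", "--run", "build",
#  "C:/Home/test/file.tpt", "Auslieferung (ML) Execute"]
fast = [sys.executable, "-c", "pass"]
slow = [sys.executable, "-c", "import time; time.sleep(30)"]

fast_result = run_with_timeout(fast, 10)
slow_result = run_with_timeout(slow, 1)
print(fast_result, slow_result)
```

A None result can then be used as the signal to restart the test run. If the program spawns further child processes of its own, killing the parent alone may not stop them; that case is not covered by this sketch.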

How do I get the log file a program creates when running it with subprocess.call()?

I work with Gaussian, a program for molecular geometry optimization, among other applications. Gaussian can take days to finish a single optimization, so I decided to write a Python program that sends me an e-mail when it finishes running. The e-mail sending I figured out. The problem is that Gaussian automatically generates a log file and a chk file, which contain the actual results of the process, and by using subprocess.call(['command'], shell=False) both files are not generated.
I also tried to solve the problem with os.system(command), which gives me the .log file and the .chk file, but the e-mail is sent without waiting for the optimization to complete.
Another important thing: I have to run the entire process in the background because, as I said at the beginning, it might take days to finish and I can't leave the terminal open that long.
by using subprocess.call(['command'], shell=False) both files are not generated.
Your comment suggests that you are trying to run subprocess.call(['g09 input.com &'], shell=False), which is wrong.
Your code should raise FileNotFoundError. If you don't see it, stderr is hidden. You should fix that (make sure that you can see the output of sys.stderr.write('stderr\n')). By default, stderr is not hidden, i.e., the way you start your parent script is broken. To be able to disconnect from the session, try:
$ nohup python /path/to/your_script.py &>your_script.log &
or use screen or tmux.
shell=False (by the way, it is the default, so there is no need to pass it explicitly) should hint strongly that the call() function does not expect a shell command. Indeed, subprocess.call() accepts an executable and its parameters as a list; it does not run the shell:
subprocess.check_call(['g09', 'input.com', 'arg 2', 'etc'])
Note: check_call() raises an exception if g09 returns with a non-zero exit code (it indicates an error usually).
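To tie this back to the e-mail notification, here is a hedged sketch of how the blocking call and its failure modes might be turned into a status message. The helper name run_and_report is hypothetical, the child command is a stand-in for ['g09', 'input.com'], and the actual e-mail sending is left out.

```python
import subprocess
import sys


def run_and_report(argv):
    """Run argv to completion and return a one-line status message."""
    try:
        subprocess.check_call(argv)  # blocks until the program exits
    except FileNotFoundError:
        # Raised when the first list item is not an executable, e.g. when a
        # whole shell command line is passed as a single string.
        return "executable not found"
    except subprocess.CalledProcessError as error:
        return "failed with exit code {}".format(error.returncode)
    return "finished successfully"


# Stand-in for ['g09', 'input.com']; the e-mail would be sent after this line.
message = run_and_report([sys.executable, "-c", "pass"])
print(message)
```

Since check_call() only returns once the program has exited, anything placed after it, such as the e-mail, runs after the optimization is over.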

os.system() failing in python

I'm trying to parse some data and make graphs with Python, and there's an odd issue coming up. A call to os.system() seems to get lost somewhere.
The following three lines:
os.system('echo foo bar')
os.system('gnuplot test.gnuplot')
os.system('gnuplot --version')
Should print:
foo bar
Warning: empty x range [2012:2012], adjusting to [1991.88:2032.12]
gnuplot 4.4 patchlevel 2
But the only significant command in the middle seems to get dropped. The script still runs the echo and version check, and running gnuplot by itself (the gnuplot shell) works too, but there is no warning and no file output from gnuplot.
Why is this command dropped, and why completely silently?
In case it's helpful, the invocation should start gnuplot, it should open a couple of files (the instructions and a data file indicated therein) and write out to an SVG file. I tried deleting the target file so it wouldn't have to overwrite, but to no avail.
This is python 3.2 on Ubuntu Natty x86_64 virtual machine with the 2.6.38-8-virtual kernel.
Is the warning printed to stderr, and that is intercepted somehow?
Try using subprocess instead, for example using
subprocess.check_output(cmd, stderr=subprocess.STDOUT)
and checking the output.
(or plain subprocess.call might work better than os.system)
So, it turned out the issue was something I failed to mention. Earlier in the script test.gnuplot and test.data were written, but I neglected to call the file objects' close() and verify that they got closed (still don't know how to do that last part so for now it cycles for a bit). So there was some unexpected behaviour going on there causing gnuplot to see two unreadable files, take no action, produce no output, and return 0.
I guess nobody gets points for this one.
Edit: I finally figured it out with the help of strace. Don't know how I did things before I learned how to use it.
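On the open point about making sure the files are closed: a with block closes the file when the block exits, and the closed attribute reports whether that has happened. The sketch below assumes this is the situation described; it uses a temporary directory and a stand-in child (the current Python interpreter counting lines) in place of gnuplot.

```python
import os
import subprocess
import sys
import tempfile

with tempfile.TemporaryDirectory() as workdir:
    data_path = os.path.join(workdir, "test.data")

    # Leaving the with block closes (and therefore flushes) the file, so the
    # child process started afterwards sees the complete contents.
    with open(data_path, "w") as data_file:
        data_file.write("2012 1\n2013 2\n")

    assert data_file.closed  # True once close() has happened

    # Stand-in for ['gnuplot', 'test.gnuplot']: a child that counts the lines.
    count_lines = "import sys; print(sum(1 for _ in open(sys.argv[1])))"
    output = subprocess.check_output(
        [sys.executable, "-c", count_lines, data_path], universal_newlines=True
    )

print(output.strip())
```

If the file were still open with unflushed data when the child started, the child could see an empty or partial file, which matches the symptom of gnuplot producing no output.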
Don't use os.system. Use the subprocess module.
os.system documentation says:
The subprocess module provides more powerful facilities for spawning
new processes and retrieving their results; using that module is
preferable to using this function.
Try this:
subprocess.check_call(['gnuplot', 'test.gnuplot'])
