Start background process in python via subprocess and write output to file

Dear stackoverflow users,
I'm looking for a solution for a probably quite easy problem. I want to automate some quantum chemical calculations and ran into a small problem.
Normally you start your quantum chemical program (in my case it's called orca) with your input file (*.inp) on a remote server as a background process and redirect the output into an output file (*.out) via
nohup orca H2.inp >& H2.out &
or something similar.
Now I wanted to use a Python script (with some templating) to write the input file automatically. At the end, the script should start the calculation in a way that lets me log out of the server without stopping orca. I tried that with
subprocess.run(["orca", input_file], stdout=output_file)
but so far it did not work. How do I "emulate" the command given at the top with the subprocess module?
Regards
Update
I have one file that is called H2.xyz. The script reads the filename, splits it at the dot, and creates an input file named H2.inp; the output should be written into the file H2.out.
Update 2
The input file is derived from the *xyz file via
xyzfile = str(sys.argv[1])
input_file = xyzfile.split(".")[0] + ".inp"
output_file = xyzfile.split(".")[0] + ".out"
and is created within the script via templating. In the end I want to run the script in the following way:
python3 script.py H2_0_1.xyz

Why not simply:
subprocess.Popen(f'orca {input_file} >& {output_file}',
                 shell=True, stdin=None, stdout=None, stderr=None, close_fds=True)
More info:
Run Process and Don't Wait
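If you would rather avoid shell=True, a minimal sketch of the same idea (assuming orca is on PATH and input_file/output_file come from the script above; start_new_session needs Python 3.2+, subprocess.DEVNULL 3.3+) is to open the output file yourself and detach the child into its own session. Note that >& is csh/bash shorthand; with shell=True the command runs under /bin/sh, where the portable spelling is > file 2>&1.
import subprocess

# Hedged sketch: redirect stdout/stderr into the .out file and detach the child
# so it keeps running after the SSH session ends (roughly what nohup ... & does).
with open(output_file, "w") as out:
    subprocess.Popen(["orca", input_file],
                     stdout=out,                  # like "> H2.out"
                     stderr=subprocess.STDOUT,    # like ">&" (merge stderr into stdout)
                     stdin=subprocess.DEVNULL,
                     start_new_session=True)      # detach from the controlling terminal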

For me (Windows, Python 2.7) the following call works fine:
with open('H2.out', 'a') as out:
    subprocess.call(['orca', infile], stdout=out,
                    stderr=out,
                    shell=True)  # Yes, I know. But it's Windows.
On Linux you probably do not need shell=True for a list of arguments.

Is the usage of subprocess important? If not, you could use os.system.
The Python call gets really short. In your case,
os.system("nohup orca H2.inp >& H2.out &")
should do the trick.
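A minimal sketch of that one-liner built from the names derived in Update 2 (input_file and output_file are assumed to exist already); since os.system hands the string to /bin/sh, the csh-style >& is written out as > ... 2>&1 here:
import os

# Hedged sketch: same nohup command, but assembled from the script's variables.
cmd = "nohup orca {} > {} 2>&1 &".format(input_file, output_file)
exit_status = os.system(cmd)  # returns the shell's exit status; orca keeps running in the background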

I had the same problem not long ago.
Here is my solution:
import subprocess
import traceback

# The shell redirection and backgrounding from the one-liner are handled here by
# the pipes and communicate() instead, so the command is passed as a plain list.
commandLineCode = ["orca", "H2.inp"]
try:
    proc = subprocess.Popen(commandLineCode,
                            stdin=subprocess.PIPE,
                            stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE,
                            cwd=workingDir)
except OSError:
    print("OSError occurred")
    print(traceback.format_exc())

timeoutInSeconds = 100
try:
    outs, errs = proc.communicate(timeout=timeoutInSeconds)
except subprocess.TimeoutExpired:
    print("timeout")
    proc.kill()
    outs, errs = proc.communicate()

stdoutDecode = outs.decode("utf-8")
stderrDecode = errs.decode("utf-8")
for line in stdoutDecode.splitlines():
    pass  # write line to outputFile
if stderrDecode:
    for line in stderrDecode.splitlines():
        pass  # write line to error log
The OSError exception is pretty important, since you never know what your OS might do wrong.
For more on communicate(), which waits for the started process to finish and collects its output, read:
https://docs.python.org/3/library/subprocess.html#subprocess.Popen.communicate
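Because the question only needs the output in a file, a shorter hedged variant of the same approach (reusing the input_file/output_file names from the question; the error-log filename is made up for illustration) hands the file objects to Popen directly, which also sidesteps the pipe-buffer problem for very chatty programs:
import subprocess

# "orca_error.log" is a hypothetical name; pick whatever fits your naming scheme.
with open(output_file, "w") as out, open("orca_error.log", "w") as err:
    proc = subprocess.Popen(["orca", input_file], stdout=out, stderr=err)
    returncode = proc.wait(timeout=100)  # Python 3.3+; raises subprocess.TimeoutExpired if orca hangs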

Related

saving subprocess output in a variable

I am using the fping program to get network latencies for a list of hosts. It's a command-line program, but I want to use it in my Python script and save the output in a database.
I am using subprocess.call() like this:
import subprocess
subprocess.call(["fping","-l","google.com"])
The problem with this is that it is an infinite loop (as indicated by the -l flag), so it will go on printing the output to the console. But after every output line, I need some sort of callback so that I can save it to the database. How can I do that?
I looked at subprocess.check_output(), but it's not working.
This may help you:
import subprocess

def execute(cmd):
    popen = subprocess.Popen(cmd.split(), stdout=subprocess.PIPE,
                             universal_newlines=True)
    for stdout_line in iter(popen.stdout.readline, ""):
        yield stdout_line
    popen.stdout.close()
    return_code = popen.wait()
    if return_code:
        raise subprocess.CalledProcessError(return_code, cmd)
So you can basically execute:
for line in execute("fping -l google.com"):
    print(line)
For example.
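To get the per-line callback the question asks for, a hedged sketch that consumes the generator above and stores every line could look like this (the sqlite table and database filename are assumptions for illustration; fping's output is stored verbatim rather than parsed):
import sqlite3

conn = sqlite3.connect("latencies.db")  # hypothetical database file
conn.execute("CREATE TABLE IF NOT EXISTS ping_log (line TEXT)")

for line in execute("fping -l google.com"):
    conn.execute("INSERT INTO ping_log (line) VALUES (?)", (line.strip(),))
    conn.commit()  # this per-line commit plays the role of the "callback"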

python subprocess module hangs for spark-submit command when writing STDOUT

I have a Python script that is used to submit Spark jobs using the spark-submit tool. I want to execute the command and write the output both to STDOUT and a logfile in real time. I'm using Python 2.7 on an Ubuntu server.
This is what I have so far in my SubmitJob.py script
#!/usr/bin/python
import subprocess

# Submit the command
def submitJob(cmd, log_file):
    with open(log_file, 'w') as fh:
        process = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
        while True:
            output = process.stdout.readline()
            if output == '' and process.poll() is not None:
                break
            if output:
                print output.strip()
                fh.write(output)
        rc = process.poll()
        return rc

if __name__ == "__main__":
    cmdList = ["dse", "spark-submit", "--spark-master", "spark://127.0.0.1:7077", "--class", "com.spark.myapp", "./myapp.jar"]
    log_file = "/tmp/out.log"
    exit_status = submitJob(cmdList, log_file)
    print "job finished with status ", exit_status
The strange thing is, when I execute the same command directly in the shell it works fine and produces output on the screen as the program proceeds.
So it looks like something is wrong in the way I'm using subprocess.PIPE for stdout and writing the file.
What's the current recommended way to use the subprocess module for writing to stdout and a log file in real time, line by line? I see a bunch of options on the internet but am not sure which is correct or latest.
thanks
Figured out what the problem was.
I was trying to redirect both stdout and stderr to a pipe to display them on screen. This seems to block stdout when stderr is present. If I remove the stderr=subprocess.STDOUT argument from Popen, it works fine. So for spark-submit it looks like you don't need to redirect stderr explicitly, as it already does this implicitly.
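A minimal sketch of that fix (the same readline loop as in SubmitJob.py, assuming cmdList and log_file as defined in the question, but with the stderr redirect dropped):
import subprocess

with open(log_file, 'w') as fh:
    # No stderr=subprocess.STDOUT here: spark-submit's stderr goes straight to
    # the terminal, and the stdout readline loop no longer blocks.
    process = subprocess.Popen(cmdList, stdout=subprocess.PIPE)
    for output in iter(process.stdout.readline, ''):
        print(output.strip())
        fh.write(output)
    rc = process.wait()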
To print the Spark log
One can use the cmdList given by user330612:
cmdList = ["spark-submit", "--spark-master", "spark://127.0.0.1:7077", "--class", "com.spark.myapp", "./myapp.jar"]
Then it can be printed using subprocess; remember to use communicate() to prevent deadlocks: https://docs.python.org/2/library/subprocess.html
The docs warn: "Deadlock when using stdout=PIPE and/or stderr=PIPE and the child process generates enough output to a pipe such that it blocks waiting for the OS pipe buffer to accept more data. Use communicate() to avoid that." Below is the code to print the log.
import subprocess

p = subprocess.Popen(cmdList, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
stdout, stderr = p.communicate()
stderr = stderr.splitlines()
stdout = stdout.splitlines()
for line in stderr:
    print line  # now it can be printed line by line to a file or something else, for the log
for line in stdout:
    print line  # for the output
More information about subprocess and printing lines can be found at:
https://pymotw.com/2/subprocess/
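If real-time output (the original requirement) matters more than simplicity, a hedged sketch that tees each line to the screen and a log file as it arrives, instead of waiting for communicate(), might look like this. It assumes cmdList as above and that spark-submit writes its log to stderr, which is why that is the stream being read:
import subprocess

with open("/tmp/out.log", "w") as log:
    p = subprocess.Popen(cmdList, stderr=subprocess.PIPE, universal_newlines=True)
    for line in iter(p.stderr.readline, ""):
        print(line.rstrip())  # real-time echo to the terminal
        log.write(line)       # and to the log file
    p.wait()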

How to redirect print and stdout to a pipe and read it from parent process?

If possible I would like to not use subprocess.Popen. The reason I want to capture the stdout of the process started by the child is that I need to save the output of the child in a variable to display it back later. However, I have yet to find a way to do so anywhere. I also need to activate multiple programs without necessarily closing the one that's active. I also need to be controlling the child process with the parent process.
I'm launching a subprocess like this
import os
import sys

listProgram = ["./perroquet.py"]
listOutput = ["", "", ""]
pipePerroquet = os.pipe()
pipeMain = os.pipe()
pipeAge = os.pipe()
pipeSavoir = os.pipe()
pid = os.fork()
process = 1
if pid == 0:
    os.close(pipePerroquet[1])
    os.dup2(pipePerroquet[0], 0)
    sys.stdout = os.fdopen(pipeMain[1], 'w')
    os.execvp("./perroquet.py", listProgram)
Now as you can see I'm launching the program with os.execvp and using os.dup2() to redirect the stdout of the child. However, I'm not sure of what I've done in the code and want to know the correct way to redirect stdout with os.dup2 and then be able to read it in the parent process.
Thank you for your help.
I cannot understand why you do not want to use the excellent subprocess module, which could save you a lot of boilerplate code (and just as many possibilities for errors ...). Anyway, I assume perroquet.py is a Python script, not an executable program. The shell knows how to find the correct interpreter for scripts, but the exec family are low-level functions that expect a real executable program.
You should at least have something like:
listProgram = [ "python", "./perroquet.py","",""]
...
os.execvp("python", listProgram)
But I'd rather use:
prog = subprocess.Popen(("python", "./perroquet.py", "", ""), stdout=subprocess.PIPE)
or even, since you are already in Python, import it and call its functions directly from there.
EDIT :
It looks like what you really want is:
user gives you a command (can be almost anything)
[ you validate that the command is safe ] - unsure if you intend to do it but you should ...
you make the shell execute the command and get its output - you may want to read stderr too and control exit code
You should try something like:
import subprocess

while True:
    cmd = raw_input("commande :")  # input() with Python 3
    if cmd.strip().lower() == "exit":
        break
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE, shell=True)
    out, err = proc.communicate()
    code = proc.returncode
    print("OUT", out, "ERR", err, "CODE", code)
It is absolutely unsafe, since this code executes any command as the underlying shell would do (including rm -rf *, rd /s/q ., ...), but it gives you the output, the error output and the return code of the command, and it can be used in a loop. The only limitation is that, since you use a different shell for each command, you cannot use commands that change the shell environment - they will be executed but will have no effect.
Here's a solution if you need to extract any changes to the environment
from subprocess import Popen, PIPE
import os

def execute_and_get_env(cmd, initial_env=None):
    if initial_env is None:
        initial_env = os.environ
    r_fd, w_fd = os.pipe()
    write_env = "; env >&{}".format(w_fd)
    p = Popen(cmd + write_env, shell=True, env=initial_env, pass_fds=[w_fd],
              stdout=PIPE, stderr=PIPE)
    output, error = p.communicate()
    # this will cause problems if the environment gets very large, as
    # writing to the pipe will hang because it gets full and we only
    # read from the pipe when the process is over
    os.close(w_fd)
    with open(r_fd) as f:
        env = dict(line[:-1].split("=", 1) for line in f)
    return output, error, env

export_cmd = "export my_var='hello world'"
echo_cmd = "echo $my_var"

out, err, env = execute_and_get_env(export_cmd)
out, err, env = execute_and_get_env(echo_cmd, env)
print(out)

Calling ffmpeg kills script in background only

I've got a Python script that calls ffmpeg via subprocess to do some mp3 manipulations. It works fine in the foreground, but if I run it in the background, it gets as far as the ffmpeg command, which itself gets as far as dumping its config into stderr. At this point, everything stops and the parent task is reported as stopped, without an exception being raised anywhere. I've tried a few other simple commands in place of ffmpeg; they execute normally in the foreground or background.
This is the minimal example of the problem:
import subprocess

inf = "3HTOSD.mp3"
outf = "out.mp3"
args = ["ffmpeg",
        "-y",
        "-i", inf,
        "-ss", "0",
        "-t", "20",
        outf]
print "About to do"
result = subprocess.call(args)
print "Done"
I really can't work out why or how a wrapped process can cause the parent to terminate without at least raising an error, and how it only happens in so niche a circumstance. What is going on?
Also, I'm aware that ffmpeg isn't the nicest of packages, but I'm interfacing with something that has ffmpeg compiled into it, so using it again seems sensible.
It might be related to Linux process in background - “Stopped” in jobs? e.g., using parent.py:
from subprocess import check_call
check_call(["python", "-c", "import sys; sys.stdin.readline()"])
should reproduce the issue: "parent.py script shown as stopped" if you run it in bash as a background job:
$ python parent.py &
[1] 28052
$ jobs
[1]+ Stopped python parent.py
A background process that tries to read from the terminal receives the SIGTTIN signal, which stops it - that is why jobs reports the job as Stopped.
The solution is to redirect the input:
import os
from subprocess import check_call

try:
    from subprocess import DEVNULL
except ImportError:  # Python 2
    DEVNULL = open(os.devnull, 'r+b', 0)

check_call(["python", "-c", "import sys; sys.stdin.readline()"], stdin=DEVNULL)
If you don't need to see ffmpeg's stdout/stderr, you could also redirect them to /dev/null (with STDOUT also imported from subprocess):
check_call(ffmpeg_cmd, stdin=DEVNULL, stdout=DEVNULL, stderr=STDOUT)
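Put together for the question's minimal example, a hedged sketch (same args list as in the question; whether silencing ffmpeg's output is acceptable depends on your use case) would be:
import os
from subprocess import check_call, STDOUT

try:
    from subprocess import DEVNULL
except ImportError:  # Python 2
    DEVNULL = open(os.devnull, 'r+b', 0)

args = ["ffmpeg", "-y", "-i", "3HTOSD.mp3", "-ss", "0", "-t", "20", "out.mp3"]
# stdin redirected so a background job never tries to read the terminal;
# stdout/stderr silenced because ffmpeg's banner is not needed here.
check_call(args, stdin=DEVNULL, stdout=DEVNULL, stderr=STDOUT)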
I like to use the commands module (Python 2 only; it was removed in Python 3 in favour of subprocess). It's simpler to use in my opinion.
import commands

cmd = "ffmpeg -y -i %s -ss 0 -t 20 %s 2>&1" % (inf, outf)
status, output = commands.getstatusoutput(cmd)
if status != 0:
    raise Exception(output)
As a side note, sometimes PATH can be an issue, and you might want to use an absolute path to the ffmpeg binary.
matt@goliath:~$ which ffmpeg
/opt/local/bin/ffmpeg
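A hedged way to do that lookup from Python itself (shutil.which exists in Python 3.3+; the /opt/local path above is just what this particular machine reports):
import shutil

ffmpeg_path = shutil.which("ffmpeg")  # e.g. "/opt/local/bin/ffmpeg", or None if not on PATH
if ffmpeg_path is None:
    raise RuntimeError("ffmpeg not found on PATH")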
From the python/subprocess/call documentation:
Wait for command to complete, then return the returncode attribute.
So as long as the process you called does not exit, your program does not go on.
You should set up a Popen process object, put its standard output and error in different buffers/streams and when there is an error, you terminate the process.
Maybe something like this works:
import subprocess

proc = subprocess.Popen(args, stderr=subprocess.PIPE)  # puts stderr into a new stream
while proc.poll() is None:
    try:
        err = proc.stderr.read()
    except Exception:
        continue
    else:
        if err:
            proc.terminate()
            break

Help with wrapping a command line tool in Python

I'm really stuck with a problem I'm hoping someone can help me with. I'm trying to create a wrapper in Python 3.1 for a command-line program called spooky. I can successfully run this program on the command line like this:
$ spooky -a 4 -b .97
My first Python wrapper attempt for spooky looked like this:
import subprocess
start = "4"
end = ".97"
spooky_path = '/Users/path/to/spooky'
cmd = [spooky_path, '-a', start, '-b', end]
process = subprocess.Popen(cmd, stdout=subprocess.PIPE)
process.wait()
print('Done')
The above code prints Done, but does not execute the program spooky
Next I tried to just execute the program on the command line like this:
$ /Users/path/to/spooky -a 4 -b .97
The above code also fails, and provides no helpful errors.
My question is: How can I get Python to run this program by sending spooky -a 4 -b .97 to the command line? I would VERY much appreciate any help you can provide. Thanks in advance.
Passing stdout=subprocess.PIPE disconnects the stdout of your process from Python's stdout; the output then has to be retrieved with the Popen.communicate() function, like so:
import subprocess
spooky_path = 'ls'
cmd = [spooky_path, '-l']
process = subprocess.Popen(cmd, stdout=subprocess.PIPE)
output = process.communicate()[0]
print "Output:", output
process.wait()
print('Done')
To make it print directly you can use it without the stdout argument:
process = subprocess.Popen(cmd)
Or you can use the call function:
returncode = subprocess.call(cmd)
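A hedged variant using check_call (spooky_path, start and end as defined in the question; check_call raises if spooky exits with a non-zero status, which makes failures visible instead of silent):
import subprocess

cmd = [spooky_path, '-a', start, '-b', end]
try:
    subprocess.check_call(cmd)
except subprocess.CalledProcessError as exc:
    print('spooky failed with exit code', exc.returncode)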
Try making your command into a single string:
cmd = '{} -a {} -b {}'.format(spooky_path, start, end)
process = subprocess.Popen(cmd, shell=True)
