Non blocking subprocess.call - python

I'm trying to make a non blocking subprocess call to run a slave.py script from my main.py program. I need to pass args from main.py to slave.py once when it(slave.py) is first started via subprocess.call after this slave.py runs for a period of time then exits.
main.py
for insert, (list) in enumerate(list, start =1):
sys.args = [list]
subprocess.call(["python", "slave.py", sys.args], shell = True)
{loop through program and do more stuff..}
And my slave script
slave.py
print sys.args
while True:
{do stuff with args in loop till finished}
time.sleep(30)
Currently, slave.py blocks main.py from running the rest of its tasks, I simply want slave.py to be independent of main.py, once I've passed args to it. The two scripts no longer need to communicate.
I've found a few posts on the net about non blocking subprocess.call but most of them are centered on requiring communication with slave.py at some-point which I currently do not need. Would anyone know how to implement this in a simple fashion...?

You should use subprocess.Popen instead of subprocess.call.
Something like:
subprocess.Popen(["python", "slave.py"] + sys.argv[1:])
From the docs on subprocess.call:
Run the command described by args. Wait for command to complete, then return the returncode attribute.
(Also don't use a list to pass in the arguments if you're going to use shell = True).
Here's a MCVE1 example that demonstrates a non-blocking suprocess call:
import subprocess
import time
p = subprocess.Popen(['sleep', '5'])
while p.poll() is None:
print('Still sleeping')
time.sleep(1)
print('Not sleeping any longer. Exited with returncode %d' % p.returncode)
An alternative approach that relies on more recent changes to the python language to allow for co-routine based parallelism is:
# python3.5 required but could be modified to work with python3.4.
import asyncio
async def do_subprocess():
print('Subprocess sleeping')
proc = await asyncio.create_subprocess_exec('sleep', '5')
returncode = await proc.wait()
print('Subprocess done sleeping. Return code = %d' % returncode)
async def sleep_report(number):
for i in range(number + 1):
print('Slept for %d seconds' % i)
await asyncio.sleep(1)
loop = asyncio.get_event_loop()
tasks = [
asyncio.ensure_future(do_subprocess()),
asyncio.ensure_future(sleep_report(5)),
]
loop.run_until_complete(asyncio.gather(*tasks))
loop.close()
1Tested on OS-X using python2.7 & python3.6

There's three levels of thoroughness here.
As mgilson says, if you just swap out subprocess.call for subprocess.Popen, keeping everything else the same, then main.py will not wait for slave.py to finish before it continues. That may be enough by itself. If you care about zombie processes hanging around, you should save the object returned from subprocess.Popen and at some later point call its wait method. (The zombies will automatically go away when main.py exits, so this is only a serious problem if main.py runs for a very long time and/or might create many subprocesses.) And finally, if you don't want a zombie but you also don't want to decide where to do the waiting (this might be appropriate if both processes run for a long and unpredictable time afterward), use the python-daemon library to have the slave disassociate itself from the master -- in that case you can continue using subprocess.call in the master.

For Python 3.8.x
import shlex
import subprocess
cmd = "<full filepath plus arguments of child process>"
cmds = shlex.split(cmd)
p = subprocess.Popen(cmds, start_new_session=True)
This will allow the parent process to exit while the child process continues to run. Not sure about zombies.
Tested on Python 3.8.1 on macOS 10.15.5

The easiest solution for your non-blocking situation would be to add & at the end of the Popen like this:
subprocess.Popen(["python", "slave.py", " &"])
This does not block the execution of the rest of the program.

If you want to start a function several times with different arguments in a non-blocking way, you can use the ThreadPoolExecuter.
You submit your function calls to the executer like this
from concurrent.futures import ThreadPoolExecutor
def threadmap(fun, xs):
with ThreadPoolExecutor(max_workers=8) as executer:
return list(executer.map(fun, xs))

Related

Run Flask app inside of another flask app [duplicate]

I'm trying to make a non blocking subprocess call to run a slave.py script from my main.py program. I need to pass args from main.py to slave.py once when it(slave.py) is first started via subprocess.call after this slave.py runs for a period of time then exits.
main.py
for insert, (list) in enumerate(list, start =1):
sys.args = [list]
subprocess.call(["python", "slave.py", sys.args], shell = True)
{loop through program and do more stuff..}
And my slave script
slave.py
print sys.args
while True:
{do stuff with args in loop till finished}
time.sleep(30)
Currently, slave.py blocks main.py from running the rest of its tasks, I simply want slave.py to be independent of main.py, once I've passed args to it. The two scripts no longer need to communicate.
I've found a few posts on the net about non blocking subprocess.call but most of them are centered on requiring communication with slave.py at some-point which I currently do not need. Would anyone know how to implement this in a simple fashion...?
You should use subprocess.Popen instead of subprocess.call.
Something like:
subprocess.Popen(["python", "slave.py"] + sys.argv[1:])
From the docs on subprocess.call:
Run the command described by args. Wait for command to complete, then return the returncode attribute.
(Also don't use a list to pass in the arguments if you're going to use shell = True).
Here's a MCVE1 example that demonstrates a non-blocking suprocess call:
import subprocess
import time
p = subprocess.Popen(['sleep', '5'])
while p.poll() is None:
print('Still sleeping')
time.sleep(1)
print('Not sleeping any longer. Exited with returncode %d' % p.returncode)
An alternative approach that relies on more recent changes to the python language to allow for co-routine based parallelism is:
# python3.5 required but could be modified to work with python3.4.
import asyncio
async def do_subprocess():
print('Subprocess sleeping')
proc = await asyncio.create_subprocess_exec('sleep', '5')
returncode = await proc.wait()
print('Subprocess done sleeping. Return code = %d' % returncode)
async def sleep_report(number):
for i in range(number + 1):
print('Slept for %d seconds' % i)
await asyncio.sleep(1)
loop = asyncio.get_event_loop()
tasks = [
asyncio.ensure_future(do_subprocess()),
asyncio.ensure_future(sleep_report(5)),
]
loop.run_until_complete(asyncio.gather(*tasks))
loop.close()
1Tested on OS-X using python2.7 & python3.6
There's three levels of thoroughness here.
As mgilson says, if you just swap out subprocess.call for subprocess.Popen, keeping everything else the same, then main.py will not wait for slave.py to finish before it continues. That may be enough by itself. If you care about zombie processes hanging around, you should save the object returned from subprocess.Popen and at some later point call its wait method. (The zombies will automatically go away when main.py exits, so this is only a serious problem if main.py runs for a very long time and/or might create many subprocesses.) And finally, if you don't want a zombie but you also don't want to decide where to do the waiting (this might be appropriate if both processes run for a long and unpredictable time afterward), use the python-daemon library to have the slave disassociate itself from the master -- in that case you can continue using subprocess.call in the master.
For Python 3.8.x
import shlex
import subprocess
cmd = "<full filepath plus arguments of child process>"
cmds = shlex.split(cmd)
p = subprocess.Popen(cmds, start_new_session=True)
This will allow the parent process to exit while the child process continues to run. Not sure about zombies.
Tested on Python 3.8.1 on macOS 10.15.5
The easiest solution for your non-blocking situation would be to add & at the end of the Popen like this:
subprocess.Popen(["python", "slave.py", " &"])
This does not block the execution of the rest of the program.
If you want to start a function several times with different arguments in a non-blocking way, you can use the ThreadPoolExecuter.
You submit your function calls to the executer like this
from concurrent.futures import ThreadPoolExecutor
def threadmap(fun, xs):
with ThreadPoolExecutor(max_workers=8) as executer:
return list(executer.map(fun, xs))

Python subprocess always waits for programm [duplicate]

I'm trying to port a shell script to the much more readable python version. The original shell script starts several processes (utilities, monitors, etc.) in the background with "&". How can I achieve the same effect in python? I'd like these processes not to die when the python scripts complete. I am sure it's related to the concept of a daemon somehow, but I couldn't find how to do this easily.
While jkp's solution works, the newer way of doing things (and the way the documentation recommends) is to use the subprocess module. For simple commands its equivalent, but it offers more options if you want to do something complicated.
Example for your case:
import subprocess
subprocess.Popen(["rm","-r","some.file"])
This will run rm -r some.file in the background. Note that calling .communicate() on the object returned from Popen will block until it completes, so don't do that if you want it to run in the background:
import subprocess
ls_output=subprocess.Popen(["sleep", "30"])
ls_output.communicate() # Will block for 30 seconds
See the documentation here.
Also, a point of clarification: "Background" as you use it here is purely a shell concept; technically, what you mean is that you want to spawn a process without blocking while you wait for it to complete. However, I've used "background" here to refer to shell-background-like behavior.
Note: This answer is less current than it was when posted in 2009. Using the subprocess module shown in other answers is now recommended in the docs
(Note that the subprocess module provides more powerful facilities for spawning new processes and retrieving their results; using that module is preferable to using these functions.)
If you want your process to start in the background you can either use system() and call it in the same way your shell script did, or you can spawn it:
import os
os.spawnl(os.P_DETACH, 'some_long_running_command')
(or, alternatively, you may try the less portable os.P_NOWAIT flag).
See the documentation here.
You probably want the answer to "How to call an external command in Python".
The simplest approach is to use the os.system function, e.g.:
import os
os.system("some_command &")
Basically, whatever you pass to the system function will be executed the same as if you'd passed it to the shell in a script.
I found this here:
On windows (win xp), the parent process will not finish until the longtask.py has finished its work. It is not what you want in CGI-script. The problem is not specific to Python, in PHP community the problems are the same.
The solution is to pass DETACHED_PROCESS Process Creation Flag to the underlying CreateProcess function in win API. If you happen to have installed pywin32 you can import the flag from the win32process module, otherwise you should define it yourself:
DETACHED_PROCESS = 0x00000008
pid = subprocess.Popen([sys.executable, "longtask.py"],
creationflags=DETACHED_PROCESS).pid
Use subprocess.Popen() with the close_fds=True parameter, which will allow the spawned subprocess to be detached from the Python process itself and continue running even after Python exits.
https://gist.github.com/yinjimmy/d6ad0742d03d54518e9f
import os, time, sys, subprocess
if len(sys.argv) == 2:
time.sleep(5)
print 'track end'
if sys.platform == 'darwin':
subprocess.Popen(['say', 'hello'])
else:
print 'main begin'
subprocess.Popen(['python', os.path.realpath(__file__), '0'], close_fds=True)
print 'main end'
Both capture output and run on background with threading
As mentioned on this answer, if you capture the output with stdout= and then try to read(), then the process blocks.
However, there are cases where you need this. For example, I wanted to launch two processes that talk over a port between them, and save their stdout to a log file and stdout.
The threading module allows us to do that.
First, have a look at how to do the output redirection part alone in this question: Python Popen: Write to stdout AND log file simultaneously
Then:
main.py
#!/usr/bin/env python3
import os
import subprocess
import sys
import threading
def output_reader(proc, file):
while True:
byte = proc.stdout.read(1)
if byte:
sys.stdout.buffer.write(byte)
sys.stdout.flush()
file.buffer.write(byte)
else:
break
with subprocess.Popen(['./sleep.py', '0'], stdout=subprocess.PIPE, stderr=subprocess.PIPE) as proc1, \
subprocess.Popen(['./sleep.py', '10'], stdout=subprocess.PIPE, stderr=subprocess.PIPE) as proc2, \
open('log1.log', 'w') as file1, \
open('log2.log', 'w') as file2:
t1 = threading.Thread(target=output_reader, args=(proc1, file1))
t2 = threading.Thread(target=output_reader, args=(proc2, file2))
t1.start()
t2.start()
t1.join()
t2.join()
sleep.py
#!/usr/bin/env python3
import sys
import time
for i in range(4):
print(i + int(sys.argv[1]))
sys.stdout.flush()
time.sleep(0.5)
After running:
./main.py
stdout get updated every 0.5 seconds for every two lines to contain:
0
10
1
11
2
12
3
13
and each log file contains the respective log for a given process.
Inspired by: https://eli.thegreenplace.net/2017/interacting-with-a-long-running-child-process-in-python/
Tested on Ubuntu 18.04, Python 3.6.7.
You probably want to start investigating the os module for forking different threads (by opening an interactive session and issuing help(os)). The relevant functions are fork and any of the exec ones. To give you an idea on how to start, put something like this in a function that performs the fork (the function needs to take a list or tuple 'args' as an argument that contains the program's name and its parameters; you may also want to define stdin, out and err for the new thread):
try:
pid = os.fork()
except OSError, e:
## some debug output
sys.exit(1)
if pid == 0:
## eventually use os.putenv(..) to set environment variables
## os.execv strips of args[0] for the arguments
os.execv(args[0], args)
You can use
import os
pid = os.fork()
if pid == 0:
Continue to other code ...
This will make the python process run in background.
I haven't tried this yet but using .pyw files instead of .py files should help. pyw files dosen't have a console so in theory it should not appear and work like a background process.

Check if background tasks are done

I am running a lot of simulations in parallel on background using:
for i in range (a, b):
os.system("python xxx.py &")
// To-Add: check if tasks are complete, then process the results
xxx.py calls another software.
Is there any way to check if the tasks are completed, so I can process the result?
If you use subprocess then you will have much better control over the external processes, including being able to start multiple processes without blocking, and check if they have completed. You will probably have to collect a set of Popen instances and use poll to see if they are complete.
Try this code:
#!/usr/bin/python3
import subprocess
import time
proc = subprocess.Popen('python xxx.py', shell=True)
while proc.poll() == None:
print ( proc.pid);
time.sleep(1);

How to open a new command prompt with subprocess.run? [duplicate]

I'm trying to make a non blocking subprocess call to run a slave.py script from my main.py program. I need to pass args from main.py to slave.py once when it(slave.py) is first started via subprocess.call after this slave.py runs for a period of time then exits.
main.py
for insert, (list) in enumerate(list, start =1):
sys.args = [list]
subprocess.call(["python", "slave.py", sys.args], shell = True)
{loop through program and do more stuff..}
And my slave script
slave.py
print sys.args
while True:
{do stuff with args in loop till finished}
time.sleep(30)
Currently, slave.py blocks main.py from running the rest of its tasks, I simply want slave.py to be independent of main.py, once I've passed args to it. The two scripts no longer need to communicate.
I've found a few posts on the net about non blocking subprocess.call but most of them are centered on requiring communication with slave.py at some-point which I currently do not need. Would anyone know how to implement this in a simple fashion...?
You should use subprocess.Popen instead of subprocess.call.
Something like:
subprocess.Popen(["python", "slave.py"] + sys.argv[1:])
From the docs on subprocess.call:
Run the command described by args. Wait for command to complete, then return the returncode attribute.
(Also don't use a list to pass in the arguments if you're going to use shell = True).
Here's a MCVE1 example that demonstrates a non-blocking suprocess call:
import subprocess
import time
p = subprocess.Popen(['sleep', '5'])
while p.poll() is None:
print('Still sleeping')
time.sleep(1)
print('Not sleeping any longer. Exited with returncode %d' % p.returncode)
An alternative approach that relies on more recent changes to the python language to allow for co-routine based parallelism is:
# python3.5 required but could be modified to work with python3.4.
import asyncio
async def do_subprocess():
print('Subprocess sleeping')
proc = await asyncio.create_subprocess_exec('sleep', '5')
returncode = await proc.wait()
print('Subprocess done sleeping. Return code = %d' % returncode)
async def sleep_report(number):
for i in range(number + 1):
print('Slept for %d seconds' % i)
await asyncio.sleep(1)
loop = asyncio.get_event_loop()
tasks = [
asyncio.ensure_future(do_subprocess()),
asyncio.ensure_future(sleep_report(5)),
]
loop.run_until_complete(asyncio.gather(*tasks))
loop.close()
1Tested on OS-X using python2.7 & python3.6
There's three levels of thoroughness here.
As mgilson says, if you just swap out subprocess.call for subprocess.Popen, keeping everything else the same, then main.py will not wait for slave.py to finish before it continues. That may be enough by itself. If you care about zombie processes hanging around, you should save the object returned from subprocess.Popen and at some later point call its wait method. (The zombies will automatically go away when main.py exits, so this is only a serious problem if main.py runs for a very long time and/or might create many subprocesses.) And finally, if you don't want a zombie but you also don't want to decide where to do the waiting (this might be appropriate if both processes run for a long and unpredictable time afterward), use the python-daemon library to have the slave disassociate itself from the master -- in that case you can continue using subprocess.call in the master.
For Python 3.8.x
import shlex
import subprocess
cmd = "<full filepath plus arguments of child process>"
cmds = shlex.split(cmd)
p = subprocess.Popen(cmds, start_new_session=True)
This will allow the parent process to exit while the child process continues to run. Not sure about zombies.
Tested on Python 3.8.1 on macOS 10.15.5
The easiest solution for your non-blocking situation would be to add & at the end of the Popen like this:
subprocess.Popen(["python", "slave.py", " &"])
This does not block the execution of the rest of the program.
If you want to start a function several times with different arguments in a non-blocking way, you can use the ThreadPoolExecuter.
You submit your function calls to the executer like this
from concurrent.futures import ThreadPoolExecutor
def threadmap(fun, xs):
with ThreadPoolExecutor(max_workers=8) as executer:
return list(executer.map(fun, xs))

running several system commands in parallel in Python

I write a simple script that executes a system command on a sequence of files.
To speed things up, I'd like to run them in parallel, but not all at once - i need to control maximum number of simultaneously running commands.
What whould be the easiest way to approach this ?
If you are calling subprocesses anyway, I don't see the need to use a thread pool. A basic implementation using the subprocess module would be
import subprocess
import os
import time
files = <list of file names>
command = "/bin/touch"
processes = set()
max_processes = 5
for name in files:
processes.add(subprocess.Popen([command, name]))
if len(processes) >= max_processes:
os.wait()
processes.difference_update([
p for p in processes if p.poll() is not None])
On Windows, os.wait() is not available (nor any other method of waiting for any child process to terminate). You can work around this by polling in certain intervals:
for name in files:
processes.add(subprocess.Popen([command, name]))
while len(processes) >= max_processes:
time.sleep(.1)
processes.difference_update([
p for p in processes if p.poll() is not None])
The time to sleep for depends on the expected execution time of the subprocesses.
The answer from Sven Marnach is almost right, but there is a problem. If one of the last max_processes processes ends, the main program will try to start another process, and the for looping will end. This will close the main process, which can in turn close the child processes. For me, this behavior happened with the screen command.
The code in Linux will be like this (and will only work on python2.7):
import subprocess
import os
import time
files = <list of file names>
command = "/bin/touch"
processes = set()
max_processes = 5
for name in files:
processes.add(subprocess.Popen([command, name]))
if len(processes) >= max_processes:
os.wait()
processes.difference_update(
[p for p in processes if p.poll() is not None])
#Check if all the child processes were closed
for p in processes:
if p.poll() is None:
p.wait()
You need to combine a Semaphore object with threads. A Semaphore is an object that lets you limit the number of threads that are running in a given section of code. In this case we'll use a semaphore to limit the number of threads that can run the os.system call.
First we import the modules we need:
#!/usr/bin/python
import threading
import os
Next we create a Semaphore object. The number four here is the number of threads that can acquire the semaphore at one time. This limits the number of subprocesses that can be run at once.
semaphore = threading.Semaphore(4)
This function simply wraps the call to the subprocess in calls to the Semaphore.
def run_command(cmd):
semaphore.acquire()
try:
os.system(cmd)
finally:
semaphore.release()
If you're using Python 2.6+ this can become even simpler as you can use the 'with' statement to perform both the acquire and release calls.
def run_command(cmd):
with semaphore:
os.system(cmd)
Finally, to show that this works as expected we'll call the "sleep 10" command eight times.
for i in range(8):
threading.Thread(target=run_command, args=("sleep 10", )).start()
Running the script using the 'time' program shows that it only takes 20 seconds as two lots of four sleeps are run in parallel.
aw#aw-laptop:~/personal/stackoverflow$ time python 4992400.py
real 0m20.032s
user 0m0.020s
sys 0m0.008s
I merged the solutions by Sven and Thuener into one that waits for trailing processes and also stops if one of the processes crashes:
def removeFinishedProcesses(processes):
""" given a list of (commandString, process),
remove those that have completed and return the result
"""
newProcs = []
for pollCmd, pollProc in processes:
retCode = pollProc.poll()
if retCode==None:
# still running
newProcs.append((pollCmd, pollProc))
elif retCode!=0:
# failed
raise Exception("Command %s failed" % pollCmd)
else:
logging.info("Command %s completed successfully" % pollCmd)
return newProcs
def runCommands(commands, maxCpu):
processes = []
for command in commands:
logging.info("Starting process %s" % command)
proc = subprocess.Popen(shlex.split(command))
procTuple = (command, proc)
processes.append(procTuple)
while len(processes) >= maxCpu:
time.sleep(.2)
processes = removeFinishedProcesses(processes)
# wait for all processes
while len(processes)>0:
time.sleep(0.5)
processes = removeFinishedProcesses(processes)
logging.info("All processes completed")
What you are asking for is a thread pool. There is a fixed number of threads that can be used to execute tasks. When is not running a task, it waits on a task queue in order to get a new piece of code to execute.
There is this thread pool module, but there is a comment saying it is not considered complete yet. There may be other packages out there, but this was the first one I found.
If your running system commands you can just create the process instances with the subprocess module, call them as you want. There shouldn't be any need to thread (its unpythonic) and multiprocess seems a tad overkill for this task.
This answer is very similar to other answers present here but it uses a list instead of sets.
For some reason when using those answers I was getting a runtime error regarding the size of the set changing.
from subprocess import PIPE
import subprocess
import time
def submit_job_max_len(job_list, max_processes):
sleep_time = 0.1
processes = list()
for command in job_list:
print 'running {n} processes. Submitting {proc}.'.format(n=len(processes),
proc=str(command))
processes.append(subprocess.Popen(command, shell=False, stdout=None,
stdin=PIPE))
while len(processes) >= max_processes:
time.sleep(sleep_time)
processes = [proc for proc in processes if proc.poll() is None]
while len(processes) > 0:
time.sleep(sleep_time)
processes = [proc for proc in processes if proc.poll() is None]
cmd = '/bin/bash run_what.sh {n}'
job_list = ((cmd.format(n=i)).split() for i in range(100))
submit_job_max_len(job_list, max_processes=50)

Categories