Subprocess doesn't respect arguments when using multiprocessing - python

The main objective here is to create a daemon-spawning function. The daemons need to run arbitrary programs (i.e. use subprocess).
What I have so far in my daemonizer.py module is:
import os
from multiprocessing import Process
from time import sleep
from subprocess import call, STDOUT

def _daemon_process(path_to_exec, std_out_path, args, shell):
    with open(std_out_path, 'w') as fh:
        args = (str(a) for a in args)

        if shell:
            fh.write("*** LAUNCHING IN SHELL: {0} ***\n\n".format(" ".join([path_to_exec] + list(args))))
            retcode = call(" ".join([path_to_exec] + list(args)), stderr=STDOUT, stdout=fh, shell=True)
        else:
            fh.write("*** LAUNCHING WITHOUT SHELL: {0} ***\n\n".format([path_to_exec] + list(args)))
            retcode = call([path_to_exec] + list(args), stderr=STDOUT, stdout=fh, shell=False)

        if retcode:
            fh.write("\n*** DAEMON EXITED WITH CODE {0} ***\n".format(retcode))
        else:
            fh.write("\n*** DAEMON DONE ***\n")

def daemon(path_to_executable, std_out=os.devnull, daemon_args=tuple(), shell=True):
    d = Process(name='daemon', target=_daemon_process, args=(path_to_executable, std_out, daemon_args, shell))
    d.daemon = True
    d.start()
    sleep(1)
When trying to run this from bash (this will create a file called test.log in your current directory):
python -c"import daemonizer;daemonizer.daemon('ping', std_out='test.log', daemon_args=('-c', '5', '192.168.1.1'), shell=True)"
It correctly spawns a daemon that launches ping, but the arguments are not respected; the same happens with shell set to False. The log file clearly states that it attempted to launch the program with the arguments passed.
As a proof of concept, create the following executable:
echo "ping -c 5 192.168.1.1" > ping_test
chmod +x ping_test
The following works as intended:
python -c"import daemonizer;daemonizer.daemon('./ping_test', std_out='test.log', shell=True)"
If I test the same call code outside of the multiprocessing.Process target, it works as expected.
So how do I fix this mess so that I can spawn processes with arguments?
I'm open to entirely different structures and modules, but they should be part of the standard library and compatible with Python 2.7.x. The requirement is that the daemon function should be callable several times asynchronously within a script, producing one daemon per call, and the target processes should be able to end up on different CPUs. The script also needs to be able to exit without affecting the spawned daemons, of course.
As a bonus, I noticed I needed a sleep for the spawning to work at all, otherwise the script terminates too quickly. Is there any way to get around that arbitrary hack, and/or how long do I really need it to wait to be safe?

Your arguments are being "used up" by the printing of them!
First, you do this:
args = (str(a) for a in args)
That creates a generator, not a list or tuple. So when you later do this:
list(args)
That consumes the arguments, and they will not be seen a second time. So you do this again:
list(args)
And get an empty list!
You could fix this by removing the fh.write lines that consume the generator, but it is much better to simply create a list in the first place:
args = [str(a) for a in args]
Then you can use args directly instead of list(args), and it will always contain the arguments.
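For reference, a minimal sketch of the corrected _daemon_process, reusing the call and STDOUT imports from the question's module (nothing else needs to change):
def _daemon_process(path_to_exec, std_out_path, args, shell):
    with open(std_out_path, 'w') as fh:
        # Build a real list once; it can be reused any number of times.
        args = [str(a) for a in args]

        if shell:
            cmd = " ".join([path_to_exec] + args)
            fh.write("*** LAUNCHING IN SHELL: {0} ***\n\n".format(cmd))
            retcode = call(cmd, stderr=STDOUT, stdout=fh, shell=True)
        else:
            cmd = [path_to_exec] + args
            fh.write("*** LAUNCHING WITHOUT SHELL: {0} ***\n\n".format(cmd))
            retcode = call(cmd, stderr=STDOUT, stdout=fh, shell=False)

        if retcode:
            fh.write("\n*** DAEMON EXITED WITH CODE {0} ***\n".format(retcode))
        else:
            fh.write("\n*** DAEMON DONE ***\n")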

Related

Python subprocess always waits for program [duplicate]

I'm trying to port a shell script to the much more readable python version. The original shell script starts several processes (utilities, monitors, etc.) in the background with "&". How can I achieve the same effect in python? I'd like these processes not to die when the python scripts complete. I am sure it's related to the concept of a daemon somehow, but I couldn't find how to do this easily.
While jkp's solution works, the newer way of doing things (and the way the documentation recommends) is to use the subprocess module. For simple commands it's equivalent, but it offers more options if you want to do something complicated.
Example for your case:
import subprocess
subprocess.Popen(["rm","-r","some.file"])
This will run rm -r some.file in the background. Note that calling .communicate() on the object returned from Popen will block until it completes, so don't do that if you want it to run in the background:
import subprocess
ls_output = subprocess.Popen(["sleep", "30"])
ls_output.communicate() # Will block for 30 seconds
See the documentation here.
Also, a point of clarification: "Background" as you use it here is purely a shell concept; technically, what you mean is that you want to spawn a process without blocking while you wait for it to complete. However, I've used "background" here to refer to shell-background-like behavior.
Note: This answer is less current than it was when posted in 2009. Using the subprocess module shown in other answers is now recommended in the docs:
(Note that the subprocess module provides more powerful facilities for spawning new processes and retrieving their results; using that module is preferable to using these functions.)
If you want your process to start in the background you can either use system() and call it in the same way your shell script did, or you can spawn it:
import os
os.spawnl(os.P_DETACH, 'some_long_running_command')
(Note that os.P_DETACH is Windows-only; the more portable flag is os.P_NOWAIT.)
See the documentation here.
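For completeness, a hedged sketch of the more portable P_NOWAIT variant ('/bin/sleep' is just a placeholder executable, not from the original answer):
import os

# P_NOWAIT returns the child's PID immediately instead of waiting for it to exit.
# The duplicated name after the path is argv[0] by convention.
pid = os.spawnl(os.P_NOWAIT, '/bin/sleep', 'sleep', '30')
print(pid)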
You probably want the answer to "How to call an external command in Python".
The simplest approach is to use the os.system function, e.g.:
import os
os.system("some_command &")
Basically, whatever you pass to the system function will be executed the same as if you'd passed it to the shell in a script.
I found this here:
On Windows (Win XP), the parent process will not finish until longtask.py has finished its work. That is not what you want in a CGI script. The problem is not specific to Python; in the PHP community the problems are the same.
The solution is to pass the DETACHED_PROCESS process creation flag to the underlying CreateProcess call in the Win32 API. If you happen to have pywin32 installed you can import the flag from the win32process module; otherwise you should define it yourself:
import subprocess
import sys

DETACHED_PROCESS = 0x00000008
pid = subprocess.Popen([sys.executable, "longtask.py"],
                       creationflags=DETACHED_PROCESS).pid
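On newer Pythons the flag is also exposed by the subprocess module itself (Windows only; to the best of my knowledge it was added in Python 3.7), so the hand-written constant can be dropped:
import subprocess
import sys

# subprocess.DETACHED_PROCESS has the same value (0x00000008) as the Win32 flag above.
pid = subprocess.Popen([sys.executable, "longtask.py"],
                       creationflags=subprocess.DETACHED_PROCESS).pid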
Use subprocess.Popen() with the close_fds=True parameter, which will allow the spawned subprocess to be detached from the Python process itself and continue running even after Python exits.
https://gist.github.com/yinjimmy/d6ad0742d03d54518e9f
import os, time, sys, subprocess

if len(sys.argv) == 2:
    time.sleep(5)
    print 'track end'
    if sys.platform == 'darwin':
        subprocess.Popen(['say', 'hello'])
else:
    print 'main begin'
    subprocess.Popen(['python', os.path.realpath(__file__), '0'], close_fds=True)
    print 'main end'
Capture output and run in the background with threading
As mentioned in this answer, if you capture the output with stdout= and then try to read(), the process blocks.
However, there are cases where you need this. For example, I wanted to launch two processes that talk to each other over a port, and save their stdout both to a log file and to stdout.
The threading module allows us to do that.
First, have a look at how to do the output redirection part alone in this question: Python Popen: Write to stdout AND log file simultaneously
Then:
main.py
#!/usr/bin/env python3
import os
import subprocess
import sys
import threading
def output_reader(proc, file):
    while True:
        byte = proc.stdout.read(1)
        if byte:
            sys.stdout.buffer.write(byte)
            sys.stdout.flush()
            file.buffer.write(byte)
        else:
            break

with subprocess.Popen(['./sleep.py', '0'], stdout=subprocess.PIPE, stderr=subprocess.PIPE) as proc1, \
     subprocess.Popen(['./sleep.py', '10'], stdout=subprocess.PIPE, stderr=subprocess.PIPE) as proc2, \
     open('log1.log', 'w') as file1, \
     open('log2.log', 'w') as file2:
    t1 = threading.Thread(target=output_reader, args=(proc1, file1))
    t2 = threading.Thread(target=output_reader, args=(proc2, file2))
    t1.start()
    t2.start()
    t1.join()
    t2.join()
sleep.py
#!/usr/bin/env python3
import sys
import time
for i in range(4):
    print(i + int(sys.argv[1]))
    sys.stdout.flush()
    time.sleep(0.5)
After running:
./main.py
stdout gets updated every 0.5 seconds, two lines at a time, to contain:
0
10
1
11
2
12
3
13
and each log file contains the respective log for a given process.
Inspired by: https://eli.thegreenplace.net/2017/interacting-with-a-long-running-child-process-in-python/
Tested on Ubuntu 18.04, Python 3.6.7.
You probably want to start by investigating the os module for forking child processes (open an interactive session and issue help(os)). The relevant functions are fork and the exec family. To give you an idea of how to start, put something like this in a function that performs the fork (the function needs to take a list or tuple args as an argument containing the program's name and its parameters; you may also want to define stdin, stdout and stderr for the new process):
try:
    pid = os.fork()
except OSError as e:
    ## some debug output
    sys.exit(1)

if pid == 0:
    ## optionally use os.putenv(..) to set environment variables
    ## os.execv expects args[0] to be the program name (argv[0])
    os.execv(args[0], args)
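As a rough sketch only (the spawn helper, the args list and the log path are made up for illustration), the child can redirect its standard streams with os.dup2 before the execv, and the parent can reap it later with os.waitpid:
import os

def spawn(args, log_path='/tmp/child.log'):
    # Fork, redirect the child's stdout/stderr to a log file, then exec the program.
    pid = os.fork()
    if pid == 0:
        fd = os.open(log_path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
        os.dup2(fd, 1)   # stdout
        os.dup2(fd, 2)   # stderr
        try:
            os.execv(args[0], args)   # does not return on success
        finally:
            os._exit(127)             # a failed exec must never fall back into the parent's code
    return pid

pid = spawn(['/bin/sleep', '5'])
os.waitpid(pid, 0)   # reap the child so it does not linger as a zombie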
You can use:
import os

pid = os.fork()
if pid == 0:
    # continue with the rest of your code here ...
This will make the Python process run in the background.
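Note that a single fork by itself does not fully detach the child; the classic double-fork recipe (shown here only as a sketch, not part of the original answer) adds os.setsid and a second fork:
import os

def daemonize():
    pid = os.fork()
    if pid > 0:
        os.waitpid(pid, 0)   # reap the short-lived intermediate child
        return False         # the original process carries on with its own work
    os.setsid()              # child: start a new session, detach from the controlling terminal
    if os.fork() > 0:
        os._exit(0)          # intermediate child exits straight away
    os.chdir('/')            # grandchild: avoid keeping a mount point busy
    return True              # grandchild continues as the daemon

if daemonize():
    # long-running daemon work goes here; it outlives the original process
    pass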
I haven't tried this yet, but using .pyw files instead of .py files should help on Windows. .pyw files don't get a console, so in theory the process should not show a window and should behave like a background process.

Printing all shell commands fired from a Python script to the console before executing them

I have a moderately large python script that executes a lot of shell commands from within itself. I would like to print all these commands to the screen before executing them, so that I can trace the control flow of the python script.
Now, the script uses multiple functions to actually execute shell commands, including system() from the os module, and call(), Popen() and check_output() from the subprocess module. What might be the easiest way to do this?
I was thinking of a wrapper function that prints the shell command argument before executing it, but I don't know how to write a generic one that can dispatch to the correct call/Popen or other function as needed. We also have to keep in mind that these calls take different numbers and types of arguments.
Thanks!
Build and install a decorator for the various calls you wish to intercept. Because even the "builtin" functions are first-class objects, you can replace them with logging versions:
import os

def log_wrap(f):
    def logging_wrapper(*args, **kwargs):
        print("Calling {} with args: {!r}, kwargs: {!r}".format(f.__name__, args, kwargs))
        return f(*args, **kwargs)
    return logging_wrapper

os.system("echo 'Hello, world!'")
os.system = log_wrap(os.system)
os.system("echo 'How do you like me now?!'")
This code, when run with python3, prints:
$ python test.py
Hello, world!
Calling system with args: ("echo 'How do you like me now?!'",), kwargs: {}
How do you like me now?!
Note that between the first and second calls to os.system, I replaced the os.system function with one that prints a log message before passing the arguments along to the original function. You can print your log message to a file, or (better yet) call the logging module, or invoke the pdb debugger, or whatever you like...
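Since the question also mentions call(), Popen() and check_output() from subprocess, the same wrapper can be installed over those names too; a sketch (exactly which functions you wrap is up to you):
import subprocess

# Replace the subprocess entry points with logging versions of themselves.
subprocess.call = log_wrap(subprocess.call)
subprocess.check_output = log_wrap(subprocess.check_output)
subprocess.Popen = log_wrap(subprocess.Popen)

subprocess.call(["echo", "wrapped"])   # now logged before it runs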
Since you want to start a child process but log your behavior, you will need to record the child process's stdout and stderr:
import subprocess as sp
import datetime

dt = datetime.datetime.now().strftime("%Y%m%d%H%M00")
ofd = open("res.sql", "w+")
efd = open("log/cmd-{}.log".format(dt), "a+")
Suppose you are constructing a command (for example, mysqldump) and you have the database, table, and credentials filenames (db, tbl, cnf) loaded. You want to print the command you are about to execute:
args = ["mysqldump", "--defaults-extra-file=" + cnf, "--single-transaction", "--quick", db, tbl]
print " ".join(args)
Now, assume you have opened output and error files above (ofd, efd),
proc = sp.Popen(args, shell=False, stdin=sp.PIPE, stdout=ofd, stderr=efd)
stdout, stderr = proc.communicate()
rc = proc.wait()
if rc > 0: print "error", rc, "dbdump failed"
else: print "result", stdout, stderr
Remember to close ofd, efd.
import commands  # Python 2 only; on Python 3 use subprocess.getstatusoutput

command = r'''find .'''
status, output = commands.getstatusoutput(command)
print("Command: {}\nresult: {}".format(command, output))

How to open a new command prompt with subprocess.run? [duplicate]

I'm trying to make a non-blocking subprocess call to run a slave.py script from my main.py program. I need to pass args from main.py to slave.py once, when it (slave.py) is first started via subprocess.call; after this, slave.py runs for a period of time and then exits.
main.py
for insert, (list) in enumerate(list, start=1):
    sys.args = [list]
    subprocess.call(["python", "slave.py", sys.args], shell=True)

    {loop through program and do more stuff..}
And my slave script
slave.py
print sys.args
while True:
    {do stuff with args in loop till finished}
    time.sleep(30)
Currently, slave.py blocks main.py from running the rest of its tasks; I simply want slave.py to be independent of main.py once I've passed args to it. The two scripts no longer need to communicate.
I've found a few posts on the net about non-blocking subprocess.call, but most of them are centered on requiring communication with slave.py at some point, which I currently do not need. Would anyone know how to implement this in a simple fashion?
You should use subprocess.Popen instead of subprocess.call.
Something like:
subprocess.Popen(["python", "slave.py"] + sys.argv[1:])
From the docs on subprocess.call:
Run the command described by args. Wait for command to complete, then return the returncode attribute.
(Also, don't use a list to pass in the arguments if you're going to use shell=True.)
Here's an MCVE1 example that demonstrates a non-blocking subprocess call:
import subprocess
import time
p = subprocess.Popen(['sleep', '5'])
while p.poll() is None:
    print('Still sleeping')
    time.sleep(1)

print('Not sleeping any longer. Exited with returncode %d' % p.returncode)
An alternative approach that relies on more recent changes to the Python language to allow for coroutine-based parallelism is:
# python3.5 required but could be modified to work with python3.4.
import asyncio
async def do_subprocess():
    print('Subprocess sleeping')
    proc = await asyncio.create_subprocess_exec('sleep', '5')
    returncode = await proc.wait()
    print('Subprocess done sleeping. Return code = %d' % returncode)

async def sleep_report(number):
    for i in range(number + 1):
        print('Slept for %d seconds' % i)
        await asyncio.sleep(1)

loop = asyncio.get_event_loop()

tasks = [
    asyncio.ensure_future(do_subprocess()),
    asyncio.ensure_future(sleep_report(5)),
]

loop.run_until_complete(asyncio.gather(*tasks))
loop.close()
1 Tested on OS X using python2.7 & python3.6
There are three levels of thoroughness here.
First, as mgilson says, if you just swap out subprocess.call for subprocess.Popen, keeping everything else the same, then main.py will not wait for slave.py to finish before it continues. That may be enough by itself.
Second, if you care about zombie processes hanging around, you should save the object returned from subprocess.Popen and at some later point call its wait method. (The zombies automatically go away when main.py exits, so this is only a serious problem if main.py runs for a very long time and/or might create many subprocesses.)
Finally, if you don't want a zombie but you also don't want to decide where to do the waiting (this might be appropriate if both processes run for a long and unpredictable time afterward), use the python-daemon library to have the slave disassociate itself from the master (a sketch follows); in that case you can continue using subprocess.call in the master.
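For the third option, a minimal sketch of the slave side, assuming the third-party python-daemon package (not in the standard library) is installed:
# slave.py, using the third-party python-daemon package
import daemon

def do_work():
    # long-running work goes here
    pass

with daemon.DaemonContext():
    # Inside the context the slave has detached from main.py,
    # so the master can exit (or wait) without affecting it.
    do_work()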
For Python 3.8.x
import shlex
import subprocess
cmd = "<full filepath plus arguments of child process>"
cmds = shlex.split(cmd)
p = subprocess.Popen(cmds, start_new_session=True)
This will allow the parent process to exit while the child process continues to run. Not sure about zombies.
Tested on Python 3.8.1 on macOS 10.15.5
For your non-blocking situation, simply using subprocess.Popen instead of subprocess.call is enough, since Popen does not wait for the child to finish:
subprocess.Popen(["python", "slave.py"])
This does not block the execution of the rest of the program. (Appending " &" to the argument list has no effect; it would just be passed to slave.py as a literal argument.)
If you want to start a function several times with different arguments in a non-blocking way, you can use the ThreadPoolExecutor.
You submit your function calls to the executor like this:
from concurrent.futures import ThreadPoolExecutor

def threadmap(fun, xs):
    with ThreadPoolExecutor(max_workers=8) as executor:
        return list(executor.map(fun, xs))
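For example, a hypothetical usage that ties this back to launching child processes; each subprocess.call runs in its own worker thread:
import subprocess

# Runs the three commands concurrently in worker threads; blocks until all have finished.
returncodes = threadmap(subprocess.call, [['sleep', '1'], ['sleep', '2'], ['sleep', '3']])
print(returncodes)   # e.g. [0, 0, 0]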

Forking a process in python

I have updated the question to be clearer.
I want to execute a function while printing numbers in the background, and check the condition.
import time

number = [1, 100]

t0 = time.time()
for i in number:
    print i
t1 = time.time()

def sum_two_numbers():
    t2 = time.time()
    c = 1 + 2
    t3 = time.time()

# verify t0 < t2 and t3 << t1
As the two scripts are completely independent, just use subprocess.Popen():
import subprocess
script1 = subprocess.Popen(['/path/to/script1', 'arg1', 'arg2', 'etc'])
script2 = subprocess.Popen(['/path/to/script2', 'arg1', 'arg2', 'etc'])
That's it, both scripts are running in the background1. If you want to wait for one of them to complete, call script1.wait() or script2.wait() as appropriate. Example:
import subprocess
script1 = subprocess.Popen(['sleep', '30'])
script2 = subprocess.Popen(['ls', '-l'])
script1.wait()
You will find that script 2 will produce its output and terminate before script 1.
If you need to capture the output of either of the child processes then you will need to use pipes, and then things get more complicated.
1 Here "background" is distinct from the usual *nix notion of a background process running in a shell; there is no job control for example. WRT subprocess, a new child process is simply being created and the requested executable loaded. No shell is involved, provided that shell=False as per the default Popen() option.
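As mentioned above, capturing output means using pipes; a minimal sketch (reusing the ls -l example, and blocking until the child exits):
import subprocess

script1 = subprocess.Popen(['ls', '-l'], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
out, err = script1.communicate()   # waits for the child and returns its captured output
print(out)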

Sending multiple commands to a bash shell which must share an environment

I am attempting to follow this answer here: https://stackoverflow.com/a/5087695/343381
I have a need to execute multiple bash commands within a single environment. My test case is simple:
import subprocess
cmd = subprocess.Popen(['bash'], stdin=subprocess.PIPE, stdout=subprocess.PIPE)
# Write the first command
command = "export greeting=hello\n"
cmd.stdin.write(command)
cmd.stdin.flush() # Must include this to ensure data is passed to child process
result = cmd.stdout.read()
print result
# Write the second command
command = "echo $greeting world\n"
cmd.stdin.write(command)
cmd.stdin.flush() # Must include this to ensure data is passed to child process
result = cmd.stdout.read()
print result
What I expected to happen (based on the referenced answer) is that I would see "hello world" printed. What actually happens is that it hangs on the first cmd.stdout.read(), and never returns.
Can anyone explain why cmd.stdout.read() never returns?
Notes:
It is absolutely essential that I run multiple bash commands from python within the same environment. Thus, subprocess.communicate() does not help because it waits for the process to terminate.
Note that in my real test case, it is not a static list of bash commands to execute. The logic is more dynamic. I don't have the option of running all of them at once.
You have two problems here:
Your first command does not produce any output, so the first read blocks waiting for some.
You are using read() instead of readline() -- read() with no size argument will block until EOF, which never arrives while bash keeps the pipe open.
The following modified code (updated with Martijn's polling suggestion) works fine:
import subprocess
import select
cmd = subprocess.Popen(['bash'], stdin=subprocess.PIPE, stdout=subprocess.PIPE)
poll = select.poll()
poll.register(cmd.stdout.fileno(),select.POLLIN)
# Write the first command
command = "export greeting=hello\n"
cmd.stdin.write(command)
cmd.stdin.flush() # Must include this to ensure data is passed to child process
ready = poll.poll(500)
if ready:
    result = cmd.stdout.readline()
    print result
# Write the second command
command = "echo $greeting world\n"
cmd.stdin.write(command)
cmd.stdin.flush() # Must include this to ensure data is passed to child process
ready = poll.poll(500)
if ready:
    result = cmd.stdout.readline()
    print result
The above has a 500 ms timeout; adjust it to your needs.
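A common alternative, offered here only as a sketch (the __CMD_DONE__ marker is an arbitrary sentinel, not from the original answer), is to echo a sentinel after each command and read lines until it appears, which avoids guessing a timeout:
import subprocess

cmd = subprocess.Popen(['bash'], stdin=subprocess.PIPE, stdout=subprocess.PIPE)

def run(command, marker="__CMD_DONE__"):
    # Run one command, then echo a sentinel so we know where its output ends.
    cmd.stdin.write(command + "\necho %s\n" % marker)
    cmd.stdin.flush()
    lines = []
    while True:
        line = cmd.stdout.readline()
        if line.strip() == marker:
            return lines
        lines.append(line)

run("export greeting=hello")
print run("echo $greeting world")[0].rstrip()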
