Executing Python script using subprocess behaves differently on Ubuntu and Windows machines

Could you help me figure out what's going on with subprocess? It behaves differently on two machines with the same Python version: one is an Ubuntu Docker container and one is a Windows machine.
Ubuntu docker
When I use subprocess to execute an external Python script with shell=True, it actually opens a new interactive session for me without executing the specified script, so I have to remove shell=True, and then everything works as expected.
You can see from the screenshot below that I need to exit() after executing the first subprocess, and run the second subprocess without shell=True.
Windows
On Windows, shell=True works the same way as running the subprocess on Ubuntu without the shell=True parameter.

Quoting https://docs.python.org/3/library/subprocess.html#popen-constructor:
On POSIX with shell=True, the shell defaults to /bin/sh. If args is a
string, the string specifies the command to execute through the shell.
This means that the string must be formatted exactly as it would be
when typed at the shell prompt. This includes, for example, quoting or
backslash escaping filenames with spaces in them. If args is a
sequence, the first item specifies the command string, and any
additional items will be treated as additional arguments to the shell
itself.
(emphasis mine)
That means that in your first example, run(['python', 'script.py'], shell=True), you are actually only starting an interactive Python session and not passing the script to the interpreter.
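A small sketch of that failure mode on POSIX (script.py is just a placeholder name):
import subprocess

# With shell=True and a list, POSIX Popen effectively executes
# ['/bin/sh', '-c', 'python', 'script.py']: 'python' becomes the whole
# -c command string and 'script.py' is merely an extra argument to the
# shell itself, so an interactive interpreter starts and the script
# never runs.
subprocess.run(['python', 'script.py'], shell=True)

# What you want instead: pass the arguments as a list, without shell=True.
subprocess.run(['python', 'script.py'])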
Further:
The only time you need to specify shell=True on Windows is when the
command you wish to execute is built into the shell (e.g. dir or
copy). You do not need shell=True to run a batch file or console-based
executable.
Conclusion: Whenever possible, pass the arguments as a list (as you did), but do not use shell=True.

Related

Python, the relationship between the bash/python/subprocess processes (shells)?

When trying to write scripts with Python, I have a fundamental hole in my knowledge.
Update: thanks to the answers, I corrected the word shell to process/subprocess.
Nomenclature
Starting with a Bash prompt, let's call this BASH_PROCESS.
Then within BASH_PROCESS I run python3 foo.py; the Python script runs in, say, PYTHON_SUBPROCESS.
Within foo.py is a call to subprocess.run(...); this subprocess command runs in, say, SUBPROCESS_SUBPROCESS.
Within foo.py is subprocess.run(..., shell=True); this subprocess command runs in, say, SUBPROCESS_SUBPROCESS=True.
Test for whether two processes/subprocesses are "equal"
Say SUBPROCESS_A starts SUBPROCESS_B. In the questions below, when I ask whether SUBPROCESS_A == SUBPROCESS_B, what I mean is: if SUBPROCESS_B sets an env variable and runs to completion, will that env variable be set in SUBPROCESS_A? If one runs eval "$(ssh-agent -s)" in SUBPROCESS_B, will SUBPROCESS_A now have an ssh agent too?
Question
Using the above nomenclature and equality tests
Is BASH_PROCESS == PYTHON_SUBPROCESS?
Is PYTHON_SUBPROCESS == SUBPROCESS_SUBPROCESS?
Is PYTHON_SUBPROCESS == SUBPROCESS_SUBPROCESS=True?
If SUBPROCESS_SUBPROCESS=True is not equal to BASH_PROCESS, then how does one alter the executing environment (e.g. eval "$(ssh-agent -s)") so that a Python script can set up the env for the caller?
You seem to be confusing several concepts here.
TL;DR: No, there is no way for a subprocess to change its parent's environment. See also Global environment variables in a shell script
You really don't seem to be asking about "shells".
Instead, these are subprocesses; if you run python foo.py in a shell, the Python process is a subprocess of the shell process. (Many shells let you exec python foo.py which replaces the shell process with a Python process; this process is now a subprocess of whichever process started the shell. On Unix-like systems, ultimately all processes are descendants of process 1, the init process.)
subprocess simply runs a subprocess. If shell=True, then the immediate subprocess of Python is the shell, and the command(s) you run are subprocesses of that shell. The shell will be the default shell (cmd on Windows, /bin/sh on Unix-like systems), though you can explicitly override this with e.g. executable="/bin/bash".
Examples:
subprocess.Popen(['printf', '%s\n', 'foo', 'bar'])
Python is the parent process, printf is a subprocess whose parent is the Python process.
subprocess.Popen(r"printf '%s\n' foo bar", shell=True)
Python is the parent process of /bin/sh, which in turn is the parent process of printf. When printf terminates, so does sh, as it has reached the end of its script.
Perhaps notice that the shell takes care of parsing the command line and splitting it up into the four tokens we ended up explicitly passing directly to Popen in the previous example.
The commands you run have access to shell features like wildcard expansion, pipes, redirection, quoting, variable expansion, background processing, etc.
In this isolated example, none of those are used, so you are basically adding an unnecessary process. (Maybe use shlex.split() if you want to avoid the minor burden of splitting up the command into tokens.) See also Actual meaning of 'shell=True' in subprocess
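For example, shlex.split() recovers exactly those tokens from the shell-style string (a small illustration; note the format string keeps a literal backslash-n, which printf itself interprets):
import shlex

tokens = shlex.split(r"printf '%s\n' foo bar")
print(tokens)  # ['printf', '%s\\n', 'foo', 'bar']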
subprocess.Popen(r"printf '%s\n' foo bar", shell=True, executable="/bin/bash")
Python is the parent process of Bash, which in turn is the parent process of printf. Except for the name of the shell, this is identical to the previous example.
There are situations where you really need the slower and more memory-hungry Bash shell, when the commands you want to execute require features which are available in Bash, but not in the Bourne shell. In general, a better solution is nearly always to run as little code as possible in a subprocess, and instead replace those Bash commands with native Python constructs; but if you know what you are doing (or really don't know what you are doing, but need to get the job done rather than solve the problem properly), the facility can be useful.
(Separately, you should probably avoid bare Popen when you can, as explained in the subprocess documentation.)
Subprocesses inherit the environment of their parent when they are started. On Unix-like systems, there is no way for a process to change its parent's environment (though the parent may participate in making that possible, as in your eval example).
To perhaps accomplish what you may ultimately be asking about, you can set up an environment within Python and then start your other command as a subprocess, perhaps then with an explicit env= keyword argument to point to the environment you want it to use:
import os
import subprocess
...
env = os.environ.copy()                  # start from the parent's environment
env["PATH"] = "/opt/foo:" + env["PATH"]  # prepend a directory to PATH
del env["PAGER"]                         # drop a variable entirely
env["secret_cookie"] = "xyzzy"           # add a brand-new variable
subprocess.Popen(["otherprogram"], env=env)
or have Python print out values in a form which can safely be passed to eval in the Bourne shell. (Caution: this requires you to understand the perils of eval in general and the target shell's quoting conventions in particular; also, you will perhaps need to support the syntax of more than one shell, unless you are only targeting a very limited audience.)
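A minimal sketch of that idea, assuming a Bourne-compatible target shell; the variable name and value here are made up, and the caller would run something like eval "$(python print_env.py)":
import shlex

# Emit a Bourne-shell-safe assignment; shlex.quote() (Python 3.3+)
# protects against spaces and shell metacharacters in the value.
value = "s3cret value; with $pecial chars"
print("export SECRET_COOKIE=%s" % shlex.quote(value))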
... Though in many situations, the simplest solution by far is to set up the environment in the shell, then run Python as a subprocess of that shell instance (or exec python if you want to get rid of the shell instance after it has performed its part; see also What are the uses of the exec command in shell scripts?)
Python without an argument starts the Python REPL, which could be regarded as a "shell", though we would commonly not use that term (perhaps instead call it "interactive interpreter" - see also below); but python foo.py simply runs the script foo.py and exits, so there is no shell there.
The definition of "shell" is slightly context-dependent, but you don't really seem to be asking about shells here. (Some GUIs have a concept of "graphical shell" etc but we are already out of the scope of what you were trying to ask about.) Some programs are command interpreters (the Python executable interprets and executes commands in the Python language; the Bourne shell interprets and executes shell scripts) but generally only those whose primary purposes include running other programs are called "shells".
None of those equalities are true, and half of those "shells" aren't actually shells.
Your bash shell is a shell. When you launch your Python script from that shell, the Python process that runs the script is a child process of the bash shell process. When you launch a subprocess from the Python script, that subprocess is a child process of the Python process. If you launch the subprocess with shell=True, Python invokes a shell to parse and run the command, but otherwise, no shell is involved in running the subprocess.
Child processes inherit environment variables from their parent on startup (unless you take specific steps to avoid that), but they cannot set environment variables for their parent. You cannot run a Python script to set environment variables in your shell, or run a subprocess from Python to set your Python script's environment variables.
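A quick way to convince yourself of this (a sketch for a Unix-like system; FOO is an arbitrary name):
import os
import subprocess

# The child shell sets FOO in its own environment and then exits...
subprocess.run(['sh', '-c', 'FOO=bar; echo "child sees: $FOO"'])

# ...but the parent's environment is untouched.
print('parent sees:', os.environ.get('FOO'))  # parent sees: None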

How do I tell Popen to use/set certain environment variables?

I'm using Python 3.7 and Django. I use the below to run a command I would normally run in the shell ...
from subprocess import Popen, PIPE, STDOUT

out = Popen([settings.SELENIUM_RUNNER_CMD, file_path], stderr=STDOUT, stdout=PIPE)
t = out.communicate()[0], out.returncode
This dies with the error
b'env: node: No such file or directory\n'
What I'm trying to figure out is how to give my Python environment access to the normal environment variables I have access to, or figure out a way to set them before I run my Python command. Normally "node" is easily found when I check as myself
davea$ which node
/usr/local/bin/node
But I don't know how to tell Python to use the same PATH that I have access to.
If we refer to Popen's documentation, we can see three relevant arguments:
cwd: a str or path-like object; the current working directory for the child
env: a mapping (let's say a dict); the environment passed to the called program
shell: a flag; whether you wrap the program inside a shell or not
Let's review each solution.
If you can afford it, just use cwd: for instance, if node is in /usr/local/bin, you can pass cwd="/usr/local/bin" or cwd=os.path.join(USR_LOCAL, 'bin'). But everything will then be created in this folder, which might not be what you wish for (logs, assumptions about the current working directory).
Now, for the environment:
If env is not None, it must be a mapping that defines the environment variables for the new process; these are used instead of the default behavior of inheriting the current process’ environment. It is passed directly to Popen.
You can just copy your current environment through os.environ and add something in the PATH like this:
import os

new_env = os.environ.copy()
new_env['PATH'] = '{}:/usr/local/bin'.format(new_env['PATH'])
Then pass this new_env mapping and there you are!
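For instance (a sketch: node --version just stands in for the real command), the child started with env=new_env resolves node through the extended PATH:
import subprocess

# On POSIX, the executable search and any /usr/bin/env shebang consult
# the PATH from the env mapping we pass, so node under /usr/local/bin
# is found.
subprocess.Popen(['node', '--version'], env=new_env)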
If you really want to rely on the shell, you can, but here are the platform details:
POSIX-platforms
On POSIX with shell=True, the shell defaults to /bin/sh. If args is a string, the string specifies the command to execute through the shell. This means that the string must be formatted exactly as it would be when typed at the shell prompt. This includes, for example, quoting or backslash escaping filenames with spaces in them. If args is a sequence, the first item specifies the command string, and any additional items will be treated as additional arguments to the shell itself. That is to say, Popen does the equivalent of: Popen(['/bin/sh', '-c', args[0], args[1], ...])
Windows platforms
On Windows with shell=True, the COMSPEC environment variable specifies the default shell. The only time you need to specify shell=True on Windows is when the command you wish to execute is built into the shell (e.g. dir or copy). You do not need shell=True to run a batch file or console-based executable.
You can then use something like PATH=whatever in the command string and bring your whole shell-fu to bear directly, but the caveat is security: you are handing the string to a shell.
Bonus solution
Just re-define PATH before calling your Python process. If you're using Django, you're either using:
The development server
A production-grade server
In both cases, all you have to do is redefine the environment of the parent process. For a production-grade server such as Gunicorn, this is possible and documented. For a development server, do it at your own shell level (but beware: you might have to document such behavior, or tell anyone using your software that you assume node is in the PATH, which is… fair most of the time).
os.environ.copy() is best for what you're looking for.
import os
from subprocess import Popen, PIPE, STDOUT

my_env = os.environ.copy()  # inherit the parent's environment; tweak (e.g. PATH) as needed
out = Popen([settings.SELENIUM_RUNNER_CMD, file_path], stderr=STDOUT, stdout=PIPE, env=my_env)
t = out.communicate()[0], out.returncode
And that should be it!

Python execute windows cmd functions

I know you can run Linux terminal commands through Python scripts using subprocess
subprocess.call(['ls', '-l']) # for linux
But I can't find a way to do the same thing on Windows:
subprocess.call(['dir']) # for windows
is it possible using Python without heavy tinkering?
Should I stick to good old fashioned batch files?
dir is not a file, it is an internal command, so the shell keyword must be set to True.
subprocess.call(["dir"], shell=True)
Try this:
import os
os.system("windows command")
For example, for date:
os.system("date")
Almost everyone's answers are right, but it seems I can do what I need using os.popen: varStr = os.popen('dir /b *.py').read()
First of all, to get a directory listing, you should rather use os.listdir(). If you invoke dir instead, you'll have to parse its output to make any use of it, which is lots of unnecessary work and is error-prone.
Now,
dir is a cmd.exe built-in command, it's not a standalone executable. cmd.exe itself is the executable that implements it.
So, you have two options (use check_output instead of check_call if you need to get the output instead of just printing it):
use cmd's /C switch (execute a command and quit):
subprocess.check_call(['cmd','/c','dir','/s'])
use shell=True Popen() option (execute command line through the system shell):
subprocess.check_call('dir /s', shell=True)
The first way is the recommended one. That's because:
In the second case, cmd will do any shell transformations that it normally would (e.g. splitting the line into arguments, unquoting, environment variable expansion, etc.). So your arguments may suddenly become something else, and potentially harmful, particularly if they happen to contain spaces or cmd special characters and/or keywords.
shell=True uses the "default system shell" (pointed to via COMSPEC environment variable in the case of Windows), so if the user has redefined it, your program will behave unexpectedly.
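If you need the output rather than just the exit status, a minimal sketch using the recommended cmd /c form (the decoding is hedged, since the console codepage varies between Windows systems):
import subprocess

# Capture the output of cmd's built-in dir via the /c form.
raw = subprocess.check_output(['cmd', '/c', 'dir', '/s'])
# Decode defensively; the actual console codepage (e.g. cp437, cp850)
# depends on the system's configuration.
print(raw.decode(errors='replace'))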

subprocess.Popen gives random result

I wrote a simple piece of code:
import subprocess
p=subprocess.Popen('mkdir -p ./{a,b,c}', shell=True, stderr=subprocess.STDOUT)
p.wait()
Unfortunately, it does not always behave the way I'd expect. That is, when I run it on my PC, everything is OK (ls -l shows three dirs: a, b and c), but when my colleague runs it on his desktop, he gets one dir named '{a,b,c}'. We both use Python 2.7.3. Why is that? How would you fix it?
I tried to find the answer by myself. According to the manual:
"args should be a sequence of program arguments or else a single string. By default, the program to execute is the first item in args if args is a sequence. If args is a string, the interpretation is platform-dependent and described below. See the shell and executable arguments for additional differences from the default behavior. Unless otherwise stated, it is recommended to pass args as a sequence."
So I tried to execute the code in shell:
python -c "import subprocess; p=subprocess.Popen(['mkdir', '-p', './{ea,fa,ga}'], shell=True, stderr=subprocess.STDOUT); p.wait()"
And I got:
mkdir: missing operand
I will be thankful for any advice
Thanks!
The ./{a,b,c} syntax is bash syntax, not supported by all shells.
The documentation says:
On Unix with shell=True, the shell defaults to /bin/sh. If args is a
string, the string specifies the command to execute through the shell.
So your command only works if /bin/sh is symlinked to a shell that supports that syntax, like bash or zsh. Your colleague is probably using dash or another shell that doesn't support this.
You should not be relying on something like the user's default shell. Instead, write the full command with the full expansion:
p = subprocess.Popen('mkdir -p ./a ./b ./c', shell=True, stderr=subprocess.STDOUT)
There are several problems here.
First: if you are using a sequence of arguments, do not set shell=True (this is recommended in the Popen manual). Set it to False, and you'll see that your mkdir command is accepted.
"./{a,b,c}" is, AFAIK, Bash-specific brace-expansion syntax. If your colleague is using a different shell, it will probably not work, or will behave differently.
You could use Python's own mkdir function (os.mkdir) instead of calling a shell command; it will work whatever the server, shell or OS.
Thank you all for your answers.
It seems that the best way is to simply use /bin/sh-compatible syntax. I changed my code to use:
'mkdir -p ./a ./b ./c'
as you suggested.
I avoided using the mkdir() function because I am writing a script with plenty of system calls, and I wanted to provide an elegant --dry-run option (so I could list all of the commands).
Problem solved - thank you!
The os.mkdir(path[, mode]) function is, as far as I understand, safer to use when working on multi-platform projects:
os.mkdir(os.path.join(os.getcwd(), 'a'))
However, it's not as elegant as the subprocess approach.
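If you do go the native route, a minimal cross-platform sketch of mkdir -p ./a ./b ./c (exist_ok requires Python 3.2+):
import os

# os.makedirs with exist_ok=True mirrors mkdir -p: it creates missing
# parent directories and does not fail if the directory already exists.
for name in ('a', 'b', 'c'):
    os.makedirs(name, exist_ok=True)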

Cannot make consecutive calls with subprocess

I'm having trouble using multiple subprocess calls back to back.
These 2 work fine:
subprocess.call(["gmake", "boot-tilera"], cwd="/home/ecorbett/trn_fp")
p = subprocess.Popen(["gmake", "run-tilera"], stdout=subprocess.PIPE, cwd="/home/ecorbett/trn_fp")
However, I get an error when I try to run this call directly after:
time.sleep(10)
subprocess.call(["./go2.sh"], cwd="/home/ecorbett/trn_fp/kem_ut")
I added sleep in there because I need a few seconds before I run the "./go2.sh" program. Not sure if that is the issue.
Any advice?
A possible reason why your shell script works on the command line but not from Python is that the shebang line was not written correctly (or not written at all). See an example in which the script works from a command line but not as a Python subprocess: Is this the right way to run a shell script inside Python?
If your shell script does not have a shebang line, it still works from the command line because $SHELL is set in your environment and the script falls back to that. When run as a Python subprocess, Python does not know how to execute it and fails with OSError: [Errno 8] Exec format error. The subprocess.call() to gmake worked because gmake is a binary program, not a shell script. Passing the argument shell=True instructs subprocess to interpret the argument exactly as a shell would.
However, be careful about using shell=True in subprocess.call() as it may be insecure in some cases: subprocess Python docs.
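If a missing shebang is indeed the culprit, one hedged fix is to name the interpreter explicitly instead of relying on shell=True (the path here is copied from the question):
import subprocess

# Run the script through an explicit interpreter, so no shebang line
# (and no shell=True) is needed.
subprocess.call(['/bin/sh', './go2.sh'], cwd='/home/ecorbett/trn_fp/kem_ut')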
