Python, the relationship between the bash/python/subprocess processes (shells)?

Python, the relationship between the bash/python/subprocess processes (shells)? - python

When trying to write script with python, I have a fundamental hole of knowledge.
Update: Thanks to the answers I corrected the word shell to process/subprocess
Nomenclature
Starting with a Bash prompt, lets call this BASH_PROCESS
Then within BASH_PROCESS I run python3 foo.py, the python script runs in say PYTHON_SUBPROCESS
Within foo.py is a call to subprocess.run(...), this subprocess command runs in say `SUBPROCESS_SUBPROCESS
Within foo.py is subprocess.run(..., shell=True), this subprocess command runs in say SUBPROCESS_SUBPROCESS=True
Test for if a process/subprocess is equal
Say SUBPROCESS_A starts SUBPROCESS_B. In the below questions, when I say is SUBPROCESS_A == SUBPROCESS_B, what I means is if SUBPROCESS_B sets an env variable, when it runs to completion, will they env variable be set in SUBPROCESS_A? If one runs eval "$(ssh-agent -s)" in SUBPROCESS_B, will SUBPROCESS_A now have an ssh agent too?
Question
Using the above nomenclature and equality tests
Is BASH_PROCESS == PYTHON_SUBPROCESS?
Is PYTHON_SUBPROCESS == SUBPROCESS_SUBPROCESS?
Is PYTHON_SUBPROCESS == SUBPROCESS_SUBPROCESS=True?
If SUBPROCESS_SUBPROCESS=True is not equal to BASH_PROCESS, then how does one alter the executing environment (e.g. eval "$(ssh-agent -s)") so that a python script can set up the env for the calller?

You seem to be confusing several concepts here.
TLDR No, there is no way for a subprocess to change its parent's environment. See also Global environment variables in a shell script
You really don't seem to be asking about "shells".
Instead, these are subprocesses; if you run python foo.py in a shell, the Python process is a subprocess of the shell process. (Many shells let you exec python foo.py which replaces the shell process with a Python process; this process is now a subprocess of whichever process started the shell. On Unix-like systems, ultimately all processes are descendants of process 1, the init process.)
subprocess runs a subprocess, simply. If shell=True then the immediate subprocess of Python is the shell, and the command(s) you run are subprocesses of that shell. The shell will be the default shell (cmd on Windows, /bin/sh on Unix-like systems) though you can explicitly override this with e.g. executable="/bin/bash"
Examples:
subprocess.Popen(['printf', '%s\n', 'foo', 'bar'])
Python is the parent process, printf is a subprocess whose parent is the Python process.
subprocess.Popen(r"printf '%s\n' foo bar", shell=True)
Python is the parent process of /bin/sh, which in turn is the parent process of printf. When printf terminates, so does sh, as it has reached the end of its script.
Perhaps notice that the shell takes care of parsing the command line and splitting it up into the four tokens we ended up explicitly passing directly to Popen in the previous example.
The commands you run have access to shell features like wildcard expansion, pipes, redirection, quoting, variable expansion, background processing, etc.
In this isolated example, none of those are used, so you are basically adding an unnecessary process. (Maybe use shlex.split() if you want to avoid the minor burden of splitting up the command into tokens.) See also Actual meaning of 'shell=True' in subprocess
subprocess.Popen(r"printf '%s\n' foo bar", shell=True, executable="/bin/bash")
Python is the parent process of Bash, which in turn is the parent process of printf. Except for the name of the shell, this is identical to the previous example.
There are situations where you really need the slower and more memory-hungry Bash shell, when the commands you want to execute require features which are available in Bash, but not in the Bourne shell. In general, a better solution is nearly always to run as little code as possible in a subprocess, and instead replace those Bash commands with native Python constructs; but if you know what you are doing (or really don't know what you are doing, but need to get the job done rather than solve the problem properly), the facility can be useful.
(Separately, you should probably avoid bare Popen when you can, as explained in the subprocess documentation.)
Subprocesses inherit the environment of their parent when they are started. On Unix-like systems, there is no way for a process to change its parent's environment (though the parent may participate in making that possible, as in your eval example).
To perhaps accomplish what you may ultimately be asking about, you can set up an environment within Python and then start your other command as a subprocess, perhaps then with an explicit env= keyword argument to point to the environment you want it to use:
import os
...
env = os.environ.copy()
env["PATH"] = "/opt/foo:" + env["PATH"]
del env["PAGER"]
env["secret_cookie"] = "xyzzy"
subprocess.Popen(["otherprogram"], env=env)
or have Python print out values in a form which can safely be passed to eval in the Bourne shell. (Caution: this requires you to understand the perils of eval in general and the target shell's quoting conventions in particular; also, you will perhaps need to support the syntax of more than one shell, unless you are only targeting a very limited audience.)
... Though in many situations, the simplest solution by far is to set up the environment in the shell, then run Python as a subprocess of that shell instance (or exec python if you want to get rid of the shell instance after it has performed its part; see also What are the uses of the exec command in shell scripts?)
Python without an argument starts the Python REPL, which could be regarded as a "shell", though we would commonly not use that term (perhaps instead call it "interactive interpreter" - see also below); but python foo.py simply runs the script foo.py and exits, so there is no shell there.
The definition of "shell" is slightly context-dependent, but you don't really seem to be asking about shells here. (Some GUIs have a concept of "graphical shell" etc but we are already out of the scope of what you were trying to ask about.) Some programs are command interpreters (the Python executable interprets and executes commands in the Python language; the Bourne shell interprets and executes shell scripts) but generally only those whose primary purposes include running other programs are called "shells".

None of those equalities are true, and half of those "shells" aren't actually shells.
Your bash shell is a shell. When you launch your Python script from that shell, the Python process that runs the script is a child process of the bash shell process. When you launch a subprocess from the Python script, that subprocess is a child process of the Python process. If you launch the subprocess with shell=True, Python invokes a shell to parse and run the command, but otherwise, no shell is involved in running the subprocess.
Child processes inherit environment variables from their parent on startup (unless you take specific steps to avoid that), but they cannot set environment variables for their parent. You cannot run a Python script to set environment variables in your shell, or run a subprocess from Python to set your Python script's environment variables.

Related

How to export an environment variable from a Python script? [duplicate]

In Linux When I invoke python from the shell it replicates its environment, and starts the python process. Therefore if I do something like the following:
import os
os.environ["FOO"] = "A_Value"
When the python process returns, FOO, assuming it was undefined originally, will still be undefined. Is there a way for the python process (or any child process) to modify the environment of its parent process?
I know you typically solve this problem using something like
source script_name.sh
But this conflicts with other requirements I have.

No process can change its parent process (or any other existing process' environment).
You can, however, create a new environment by creating a new interactive shell with the modified environment.
You have to spawn a new copy of the shell that uses the upgraded environment and has access to the existing stdin, stdout and stderr, and does its reinitialization dance.
You need to do something like use subprocess.Popen to run /bin/bash -i.
So the original shell runs Python, which runs a new shell. Yes, you have a lot of processes running. No it's not too bad because the original shell and Python aren't really doing anything except waiting for the subshell to finish so they can exit cleanly, also.

It's not possible, for any child process, to change the environment of the parent process. The best you can do is to output shell statements to stdout that you then source, or write it to a file that you source in the parent.

I would use the bash eval statement, and have the python script output the shell code
child.py:
#!/usr/bin/env python
print 'FOO="A_Value"'
parent.sh
#!/bin/bash
eval `./child.py`

I needed something similar, I ended up creating a script envtest.py with:
import sys, os
sys.stdout = open(os.devnull, 'w')
# Python code with any number of prints (to stdout).
print("This is some other logic, which shouldn't pollute stdout.")
sys.stdout = sys.__stdout__
print("SomeValue")
Then in bash:
export MYVAR=$(python3 envtest.py)
echo "MYVAR is $MYVAR"
Which echos the expected: MYVAR is SomeValue

How to run python code in the currently activated virtualenv with subprocess?

I have a virtualenv named 'venv' and it is activate:
(venv)>
and I wrote codes that I'll run it in the virtualenv (main.py):
import subprocess
result = subprocess.run('python other.py', stdout=subprocess.PIPE)
but when I run main.py file:
(venv)> python main.py
subprocess does not execute the command (python other.py) in the virtualenv i.e venv
How to run subprocess command in the current virtualenv session?

A child process can't run commands in its parent process without that process's involvement.
This is why ssh-agent requires usage as eval "$(ssh-agent -s)" to invoke the shell commands it emits on output, for example. Thus, the literal thing you're asking for here is impossible.
Fortunately, it's also unnecessary.
virtualenvs use environment variables inherited by child processes.
This means that you don't actually need to use the same shell that has a virtualenv activated to start a new Python interpreter intended to use the interpreter/libraries/etc. from that virtualenv.
subprocess.run must be passed a list, or shell=True must be used.
Either do this (which is better!)
import subprocess
result = subprocess.run(['python', 'other.py'], stdout=subprocess.PIPE)
Or this (which is worse!)
import subprocess
result = subprocess.run('python other.py', stdout=subprocess.PIPE, shell=True)

If you want to run a script with the same Python executable being used to run the current script, don't use python and rely on the path being set up properly, just use sys.executable:
A string giving the absolute path of the executable binary for the Python interpreter, on systems where this makes sense.
This works if you executed the script with python myscript.py relying on the active virtualenv's PATH. It also works if you executed the script with /usr/local/bin/python3.6 to ignore the PATH and test your script with a specific interpreter. Or if you executed the script with myscript.py, relying on a shbang line created at installation time by setuptools. Or if the script was run as a CGI depending on your Apache configuration. Or if you sudod the executable, or did something else that scraped down your environment. Or almost anything else imaginable.1
As explained in Charles Duffy's answer, you still need to use a list of arguments instead of a string (or use shell=True, but you rarely want to do that). So:
result = subprocess.run([sys.executable, 'other.py'], stdout=subprocess.PIPE)
1. Well, not quite… Examples of where it doesn't work include custom C programs that embed a CPython interpreter, some smartphone mini-Python environments, old-school Amiga Python, … The one most likely to affect you—and it's a pretty big stretch—is that on some *nix platforms, if you write a program that execs Python by passing incompatible names for the process and arg0, sys.executable can end up wrong.

Python execute windows cmd functions

I know you can run Linux terminal commands through Python scripts using subprocess
subprocess.call(['ls', '-l']) # for linux
But I can't find a way to do the same thing on windows
subprocess.call(['dir']) # for windows
is it possible using Python without heavy tinkering?
Should I stick to good old fashioned batch files?

dir is not a file, it is an internal command, so the shell keyword must be set to True.
subprocess.call(["dir"], shell=True)

Try this
import os
os.system("windows command")
ex: for date
os.system("date")

Almost everyone's answers are right but it seems I can do what I need using os.popen -- varStr = os.popen('dir /b *.py').read()

First of all, to get a directory listing, you should rather use os.listdir(). If you invoke dir instead, you'll have to parse its output to make any use of it, which is lots of unnecessary work and is error-prone.
Now,
dir is a cmd.exe built-in command, it's not a standalone executable. cmd.exe itself is the executable that implements it.
So, you have two options (use check_output instead of check_call if you need to get the output instead of just printing it):
use cmd's /C switch (execute a command and quit):
subprocess.check_call(['cmd','/c','dir','/s'])
use shell=True Popen() option (execute command line through the system shell):
subprocess.check_call('dir /s', shell=True)
The first way is the recommended one. That's because:
In the 2nd case, cmd, will do any shell transformations that it normally would (e.g. splitting the line into arguments, unquoting, environment variable expansion etc). So, your arguments may suddenly become something else and potentially harmful. In particular, if they happen to contain any spaces and cmd special characters and/or keywords.
shell=True uses the "default system shell" (pointed to via COMSPEC environment variable in the case of Windows), so if the user has redefined it, your program will behave unexpectedly.

Change parent shell's environment from a subprocess

If I have a program written in a language other than bash (say python), how can I change environment variables or the current working directory inside it such that it reflects in the calling shell?
I want to use this to write a 'command line helper' that simplifies common operations. For example, a smart cd. When I simple type in the name of a directory into my prompt, it should cd into it.
[~/]$ Downloads
[~/Downloads]$
or even
[~/]$ project5
[~/projects/project5]$
I then found How to change current working directory inside command_not_found_handle (which is exactly one of the things I wanted to do) , which introduced me to shopt -s autocd. However, this still doesn't handle the case where the supplied directory is not in ./.
In addition, if I want to do things like setting the http_proxy variable from a python script, or even update the PATH variable, what are my options?
P. S. I understand that there probably isn't an obvious way to write a magical command inside a python script that automatically updates environment variables in the calling shell. I'm looking for a working solution, not necessarily one that's elegant.

This can only be done with the parent shell's involvement and assistance. For a real-world example of a program that does this, you can look at how ssh-agent is supposed to be used:
eval "$(ssh-agent -s)"
...reads the output from ssh-agent and runs it in the current shell (-s specifies Bourne-compatible output, vs csh).
If you're using Python, be sure to use pipes.quote() (or, for Python 3.x, shlex.quote()) to process your output safely:
import pipes
dirname='/path/to/directory with spaces'
foo_val='value with * wildcards * that need escaping and \t\t tabs!'
print 'cd %s; export FOO=%s;' % (pipes.quote(dirname), pipes.quote(foo_val))
...as careless use can otherwise lead to shell injection attacks.
By contrast, if you're writing this as an external script in bash, be sure to use printf %q for safe escaping (though note that its output is targeted for other bash shells, not for POSIX sh compliance):
#!/bin/bash
dirname='/path/to/directory with spaces'
foo_val='value with * wildcards * that need escaping and \t\t tabs!'
printf 'cd %q; export FOO=%q;' "$dirname" "$foo_val"
If, as it appears from your question, you want your command to appear to be written as a native shell function, I would suggest wrapping it in one (this practice can also be used with command_not_found_handle). For instance, installation can involve putting something like the following in one's .bashrc:
my_command() {
eval "$(command /path/to/my_command.py "$#")"
}
...that way users aren't required to type eval.

Essentially, Charles Duffy hit the nail on the head, I present here another spin on the issue.
What you're basically asking about is interprocess communication: You have a process, which may or may not be a subprocess of the shell (I don't think that matters too much), and you want that process to communicate information to the original shell (just another process, btw), and have it change its state.
One possibility is to use signals. For example, in your shell you could have:
trap 'cd /tmp; pwd;' SIGUSR2
Now:
Type echo $$ in your shell, this will give you a number, PID
cd to a directory in your shell (any directory other than /tmp)
Go to another shell (in another window or what have you), and type: kill SIGUSR2 PID
You will find that you are in /tmp in your original shell.
So that's an example of the communication channel. The devil of course is in the details. There are two halves to your problem: How to get the shell to communicate to your program (the command_not_found_handle would do that nicely if that would work for you), and how to get your program to communicate to the shell. Below, I cover the latter issue:
You could, for example, have a trap statement in the original shell:
trap 'eval $(/path/to/my/fancy/command $(pwd) $$)' SIGUSR2
...your fancy command will be given the current working directory of the original shell as the first argument, and the process id of the shell (so it knows who to signal), and it can act upon it. If your command sends an executable shell command string to the eval command, it will be executed in the environment of the original shell.
For example:
trap 'eval $(/tmp/doit $$ $(pwd)); pwd;' SIGUSR2
/tmp/doit is the fancy command. It could be any executable type [Python, C, Perl, etc.]), the key is that it spits out a string that the shell can evaluate. In /tmp/doit, I have provided a bash script:
#!/bin/bash
echo "echo PID: $1 original directory: $2; cd /tmp"
(I make sure the file is executable with: chmod 755 /tmp/doit). Now if I type:
cd; echo $$
Then, in another shell, take the number output ("NNNNN") by the above echo and do:
kill -s SIGUSR2 NNNNN
...then suddenly I will see something like this pop up in the original shell:
PID: NNNNN original directory: /home/myhomepath
/tmp
and if I type "pwd" in my original shell, I will see that I'm in /tmp.
The guy who wanted command_not_found_handle to do something in the current shell environment could have used signals to get the effect he wanted. Here I was running the kill manually but there's no reason why a shell function couldn't do it.
Doing fancy work on the frontend, whereby you re-interpret or pre-interpret the user's input to the shell, may require that the user runs a frontend program that could be pretty complicated, depending on what you want to do. The old school "expect" program is ideal for something like this, but not too many youngsters pick up TCL these days :-) .

Prevent creating new child process using subprocess in Python

I need to run a lot of bash commands from Python. For the moment I'm doing this with
subprocess.Popen(cmd, shell=True)
Is there any solution to run all these commands in the same shell? subprocess.Popen opens a new shell at every execution and I need to set up all the necessary variables at every call, in order for cmd command to work properly.

subprocess.Popen lets you supply a dictionary of environment variables, which will become the environment for the process being run. If the only reason you need shell=True is to set an environment variable, then I suggest you use an explicit environment dictionary instead; it's safer and not particularly difficult. Also, it's usually much easier to construct command invocations when you don't have to worry about quoting and shell metacharacters.
It may not even be necessary to construct the environment dictionary, if you don't mind having the environment variables set in the running process. (Most of the time, this won't be a problem, but sometimes it is. I don't know enough about your application to tell.)
If you can modify your own environment with the settings, just do that:
os.environ['theEnvVar'] = '/the/value'
Then you can just use a simple Popen.call (or similar) to run the command:
output = subprocess.check_output(["ls", "-lR", "/tmp"])
If for whatever reason you cannot change your own environment, you need to make a copy of the current environment, modify it as desired, and pass it to each subprocess.call:
env = os.environ.copy()
env['theEnvVar'] = '/the/value'
output = subprocess.check_output(["ls", "-lR", "/tmp"], env=env)
If you don't want to have to specify env=env every time, just write a little wrapper class.

Why not just create a shell script with all the commands you need to run, then just use a single subprocess.Popen() call to run it? If the contents of the commands you need to run depend on results calculated in your Python script, you can just create the shell script dynamically, then run it.

Use multiprocessing instead, it's more lightweight and efficient.
Unlike subprocess.Popen it does not open a new shell at every execution.
You didn't say you need to run subprocess.Popen and you may well not need to; you just said that's what you're currently doing. More justification please.
See set env var in Python multiprocessing.Process for how to set your env vars once and for all in the parent process.

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.