Alternative to hardcoding Python interpreter exec during unittest - python

Assume a unittest test in which a multi-line output file is generated by a Python script (which uses argparse), and that file is compared for equality against an expected outcome.
def test_actual_vs_expected_output(self):
    actual_inp = '/path_to/actu_inp.txt'
    expect_otp = '/path_to/expe_otp.txt'
    actual_otp = '/path_to/actu_otp.txt'
    myScript = '/path_to/myScript.py'
    cmd_list = ['python2', myScript,
                '-i', actual_inp,
                '-o', actual_otp]
    try:
        subprocess.check_output(' '.join(cmd_list), shell=True)
    except subprocess.CalledProcessError as e:
        print e.output
    if os.path.isfile(actual_otp):
        expect_str = open(expect_otp).read()
        actual_str = open(actual_otp).read()
        self.assertMultiLineEqual(expect_str, actual_str)
How can I avoid hardcoding the calling of python2 (i.e., in cmd_list of the above example)? After all, the Python2 interpreter may be called differently on different systems.

To call Python in a subprocess, you can use the currently running Python interpreter. The full path of this interpreter is given by the global variable sys.executable.
So, you can write:
import sys
cmd_list = [sys.executable, myScript,
'-i', actual_inp,
'-o', actual_otp]
A comment: the subprocess.check_output function accepts a list of arguments, so you can pass cmd_list as-is (you don't need to join it, and you don't need shell=True):
subprocess.check_output(cmd_list)
Another comment: a Python script may write error messages to STDERR. You may want an alternative to check_output to get the error message, or use stderr=subprocess.STDOUT.
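As a minimal sketch of the stderr=subprocess.STDOUT variant (the inline child script here is just a stand-in for the real myScript.py):

```python
import subprocess
import sys

# Run a child Python process and merge its stderr into the captured output.
result = subprocess.check_output(
    [sys.executable, '-c',
     'import sys; sys.stderr.write("warning\\n"); print("done")'],
    stderr=subprocess.STDOUT,
)
print(result.decode())
```

Both the stderr line and the stdout line end up in the returned bytes, so a failing script's diagnostics are not lost.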

Related

Python subprocess time calls /usr/bin/time instead of keyword

I'm trying to get the time of a process, and when I use the time keyword in the shell, I get nicer output, such as:
real 0m0,430s
user 0m0,147s
sys 0m0,076s
instead of /usr/bin/time, which gives different output. When I try to run it through Python's subprocess library with subprocess.call('time command args', shell=True), it invokes /usr/bin/time instead of the keyword. How can I use the keyword as opposed to the current one?
shell=True causes subprocess to use /bin/sh, not bash. You need the executable argument as well:
subprocess.call('time command args', shell=True, executable='/bin/bash')
Adjust the path to bash as necessary.
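A quick sketch of this in practice (assuming /bin/bash is present at that path): the bash time keyword writes its report to stderr, so it can be captured like this:

```python
import subprocess

# Run the bash 'time' keyword; its timing report goes to stderr.
result = subprocess.run(
    'time ls > /dev/null',
    shell=True,
    executable='/bin/bash',
    capture_output=True,
    text=True,
)
print(result.stderr)  # contains lines like: real 0m0.004s
```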

How to get output of OS command from Jupyter notebook?

I am running Jupyter notebook on a server (python 3).
I want to see the output of an OS command (any OS command; this is just an example):
output = os.system("pwd")
When I print it:
print (output)
the response is 0.
How do I get simple output (like in a CLI)?
Thanks.
Just found it on the internet and wanted to post it. It needs to be:
print(os.popen('ls').read())
(or any other OS command).
This works fine.
import os
print(os.getcwd())
print(os.system("pwd"))
But this question is a duplicate:
how to show current directory in ipython promp
Note that os.system() calls are not the preferred way to run commands, and they do not capture the output (see here for more on this).
The preferred and safer mechanism, which does capture the output of commands, is subprocess.run(). It has a capture_output parameter and returns a CompletedProcess object whose stdout and stderr members contain the output stream contents when capture_output=True.
It is worth mentioning that for portability it is usually better to use the methods from the os, shutil, pathlib and glob libraries, etc. This is because calls such as ls, pwd, dir, etc., will work on some platforms but not others.
Example:
import subprocess
result = subprocess.run(['pwd'], capture_output=True)
# Returns the current directory as bytes for processing in result.stdout on Mac/Linux but raises an exception on Windows
print(result.stdout)
result = subprocess.run(['ls', '*.docx'], capture_output=True)
# Without a shell the '*.docx' glob is not expanded, so ls receives it literally and reports an error in result.stderr on Mac/Linux; on Windows this raises an exception
print(result.stdout)
However:
import pathlib
cwd = pathlib.Path.cwd() # Returns the current directory as a Path object on any platform.
print(cwd)
docs = cwd.glob('*.docx') # Returns a generator yielding a Path for each .docx file in cwd on any platform
print(', '.join(p.name for p in docs)) # Print a comma-separated list of filenames
Note that for long-running or very verbose calls it is often better to use the subprocess.Popen constructor with communicate() or wait().
If you want to start an external process and not wait for it to finish, then use asyncio's create_subprocess_exec() call instead.
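As a minimal sketch (using echo as a placeholder command), launching a process asynchronously and reading its output looks like this:

```python
import asyncio

async def run_echo():
    # Start the process without blocking the event loop.
    proc = await asyncio.create_subprocess_exec(
        'echo', 'hello',
        stdout=asyncio.subprocess.PIPE,
    )
    stdout, _ = await proc.communicate()  # wait for it and collect output
    return stdout.decode().strip()

print(asyncio.run(run_echo()))  # prints "hello"
```

Between the create call and communicate(), the event loop is free to run other coroutines, which is the point of the asyncio variant.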

import a python module whose location is unknown

I would like to import a function from a given module (MOD.py) in Python, whose location I do not know. To do this, I have performed two steps:
First step, I get the path to the directory that contains the module:
path = subprocess.check_output(['find', 'home/scripts','-iname','MOD.py','|','sed','s/\/MOD.py//g']).rstrip()
Secondly, I point at this directory to get the function from the module:
sys.path.insert(0,'{0}'.format(path))
from MOD import function
The code written is failing in the first step, particularly in the sed. Why is it not working? Is there a clearer way to do the first step? Is it necessary to do two steps, or is it possible to do it with one python instruction?
Thanks!
First, note that you cannot use a pipe like that! To use a pipe you have to pass shell=True, and with shell=True the command should be a single string rather than a list. Also, your code fails in the path argument of find: add a / before home.
If the executed command returns a nonzero exit code, an exception is raised. You can use a try-except with subprocess.CalledProcessError to catch errors and get the output along with the exit code:
import subprocess
try:
    path = subprocess.check_output(
        "find /home/scripts -iname MOD.py | sed 's/\/MOD.py//g'",
        shell=True, stderr=subprocess.STDOUT).rstrip()
except subprocess.CalledProcessError as e:
    out_bytes = e.output
    code = e.returncode
In addition, as a more secure method, I suggest not using shell=True; instead, chain two commands:
ps = subprocess.Popen(['find', '/home/scripts','-iname','MOD.py'], stdout=subprocess.PIPE)
path = subprocess.check_output(['sed','s/\/MOD.py//g'], stdin=ps.stdout)
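As for doing it in one instruction: a pure-Python sketch that avoids subprocess entirely, assuming the /home/scripts root from the question, uses pathlib:

```python
import pathlib

# Recursively search for MOD.py under the root and take the directory
# of the first match; 'match' is None if nothing is found.
root = pathlib.Path('/home/scripts')
match = next(root.rglob('MOD.py'), None)
path = str(match.parent) if match else None
```

This also sidesteps the quoting and portability issues of shelling out to find and sed.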

Python subprocess.Popen fails on shell command as argument

Struggling with subprocess.Popen(): why do the first and third calls work as expected, while the second does not find any of multiple files or directories? The error message is:
>ls: Zugriff auf * nicht möglich: Datei oder Verzeichnis nicht gefunden
English translation:
ls: cannot access *: No such file or directory
Here is the code.
#!/usr/bin/python
# -*- coding: UTF-8 -*-
import subprocess
args = []
args.append ('ls')
args.append ('-al')
# First does work
cmd1 = subprocess.Popen(args)
cmd1.wait()
# Second does NOT work
args.append ('*')
cmd2 = subprocess.Popen(args)
cmd2.wait()
# Third does work
shellcmd = "ls -al *"
cmd3 = subprocess.Popen(shellcmd, shell=True )
cmd3.wait()
This is because by default subprocess.Popen() doesn't have the shell interpret the commands, so the "*" isn't being expanded into the required list of files. Try adding shell=True as a final argument to the call.
Also note the warning in the documentation about not trusting user input to be processed in this way.
This is happening because of shell globbing.
Basically, the * in ls -al * is expanded by your shell, to match all available files.
When you run the subprocess without the shell=True flag, nothing expands the * (Python does not perform glob expansion on its own), and hence the error message ls: cannot access *: No such file or directory is displayed.
When you run the command with shell=True, python actually passes the control to the shell, and hence the correct output is displayed.
As an aside, executing shell commands that incorporate unsanitized input from an untrusted source makes a program vulnerable to shell injection, a serious security flaw which can result in arbitrary command execution, so it should be used with caution (see warning here).
EDIT 1
Both shell globbing and the way Popen consumes args are causing the issue here.
From the subprocess module documentation:
class subprocess.Popen
args should be a sequence of program arguments or else a single string.
If shell is True, it is recommended to pass args as a string rather than as a sequence.
To understand that shell globbing and the manner in which Popen consumes args are the issue here, compare the output of the following. Note that in two cases, when shell=True, only ls is executed, since the input passed is a list and not a string, against the recommendation:
subprocess.Popen(['ls']) #works
subprocess.Popen('ls') #works
subprocess.Popen(['ls', '-al']) #works
subprocess.Popen(['ls -al']) #doesn't work; raises OSError since 'ls -al' is not the name of a single executable
subprocess.Popen('ls -al') #doesn't work; raises OSError since the whole string is taken as the executable name
subprocess.Popen(['ls -al'], shell=True) #works since in shell mode
subprocess.Popen('ls -al', shell=True) #works since in shell mode & string is single command
subprocess.Popen(['ls', '-al'], shell=True) #output corresponds to ls only, list passed instead of string, against recommendation
subprocess.Popen(['ls', '-al', '*']) #doesn't work; without a shell, nothing expands the *
subprocess.Popen(['ls -al *']) #doesn't work; raises OSError since 'ls -al *' is not the name of a single executable
subprocess.Popen('ls -al *') #doesn't work; raises OSError since the whole string is taken as the executable name
subprocess.Popen(['ls', '-al', '*'], shell=True) #output corresponds to ls only, list passed instead of string, against recommendation
subprocess.Popen(['ls -al *'], shell=True) #works
subprocess.Popen('ls -al *', shell=True) #works
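A couple of these cases can be checked quickly; the sketch below assumes a POSIX shell and uses echo as a placeholder command:

```python
import subprocess

# With shell=True and a list, only the first element is the command line;
# the remaining elements become the shell's positional parameters, so
# 'ignored' never reaches echo.
out_list = subprocess.run(['echo hi', 'ignored'], shell=True,
                          capture_output=True, text=True).stdout
# With shell=True and a string, the whole command line is interpreted.
out_str = subprocess.run('echo hi there', shell=True,
                         capture_output=True, text=True).stdout
print(out_list.strip())  # hi
print(out_str.strip())   # hi there
```

This is exactly why the list form with shell=True appears to "run only ls" in the table above.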
Not a direct answer to your question, but you can also try using the python library sh
example:
from sh import ls
print ls("-al")
link to more examples

subprocess wildcard usage

import os
import subprocess
proc = subprocess.Popen(['ls','*.bc'], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
out,err = proc.communicate()
print out
This script should print all the files with the .bc suffix, however it returns an empty list. If I do ls *.bc manually on the command line it works. Doing ['ls','test.bc'] inside the script works as well, but for some reason the star symbol doesn't work. Any ideas?
You need to supply shell=True to execute the command through a shell interpreter.
If you do that however, you can no longer supply a list as the first argument, because the arguments will get quoted then. Instead, specify the raw commandline as you want it to be passed to the shell:
proc = subprocess.Popen('ls *.bc', shell=True,
stdout=subprocess.PIPE,
stderr=subprocess.PIPE)
Expanding the * glob is part of the shell, but by default subprocess does not send your commands via a shell, so the command (first argument, ls) is executed, then a literal * is used as an argument.
This is a good thing; see the warning block in the "Frequently Used Arguments" section of the subprocess docs. It mainly discusses security implications, but it can also help avoid silly programming errors (as there are no magic shell characters to worry about).
My main complaint with shell=True is it usually implies there is a better way to go about the problem - with your example, you should use the glob module:
import glob
files = glob.glob("*.bc")
print files # ['file1.bc', 'file2.bc']
This will be quicker (no process startup overhead), more reliable, and cross-platform (not dependent on the platform having an ls command).
Besides doing shell=True, also make sure that your path is not quoted. Otherwise it will not be expanded by shell.
If your path may have special characters, you will have to escape them manually.
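A sketch of manual escaping with the standard library's shlex.quote (the directory here is a made-up example): quote the fixed part of the path so the shell treats it as one word, but leave the glob pattern outside the quotes so the shell can still expand it:

```python
import shlex

directory = '/tmp/my docs'  # hypothetical path containing a space
# Quote the fixed part so the shell treats it as one word,
# but leave the glob pattern unquoted so it still expands.
cmd = 'ls ' + shlex.quote(directory) + '/*.bc'
print(cmd)  # ls '/tmp/my docs'/*.bc
```

The resulting string is safe to pass with shell=True while keeping the wildcard functional.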