unit tests for command line args in Python

I have a shell script which currently takes 3 args: I run it from the command line with the shell script file name, a directory to run the Python script upon, and the name of the test data directory. I want to be able to write a unit test which executes the command below, but with a different date; depending on the data that is available, it would either pass or fail.
main_config.sh
yamldir=$1
endDate=$2   # date of the test data directory (YYYY-MM-DD)
for yaml in $(ls ${yamldir}/*.yaml | grep -v "export_config.yaml"); do
    if [ "$yaml" != "export_config.yaml" ]; then
        echo "Running export for $yaml file...";
        python valid.py -p ${yamldir}/export_config.yaml -e $yaml -d ${endDate}
        wait
    fi
done
This is what is executed on the command line
./main_config.sh /Users/name/Desktop/yaml/ 2018-12-23
This will fail and output the following on the terminal, since there is no directory called 2018-12-23:
./main_config.sh /yaml/ 2018-12-23
Running export for apa.yaml file...
apa.json does not exist
If the directory existed, this would pass and output the following on the terminal:
Running export for apa.yaml file...
File Name: apa.json Exists
File Size: 234 Bytes
Writing to file
My Python script is as follows:
import argparse
import logging

import funcs  # helper module providing read_config()


def main(get_config):
    cfg = get_config()[0]   # export_config.yaml
    data = get_config()[1]  # export_apa.yaml
    date = get_config()[2]  # data folder - YYYY-MM-DD
    # Conditional Logic


def get_config():
    parser = argparse.ArgumentParser()
    parser.add_argument("-p", "--parameter-file", action="store", required=True)
    parser.add_argument("-e", "--export-data-file", action="store", required=True)
    parser.add_argument("-d", "--export-date", action="store", required=False)
    args = parser.parse_args()
    return [funcs.read_config(args.parameter_file), funcs.read_config(args.export_data_file), args.export_date]


if __name__ == "__main__":
    logging.getLogger().setLevel(logging.INFO)
    main(get_config)

To me it looks like this is not a typical unit test (that tests a function or method) but an integration test (that tests a subsystem from the outside). But of course you could still solve this with your typical Python testing tools like unittest.
A simple solution would be to run your script using subprocess, capture the output, and then parse that output as part of your test:
import unittest
import os
import sys

if os.name == 'posix' and sys.version_info[0] < 3:
    import subprocess32 as subprocess
else:
    import subprocess


class TestScriptInvocation(unittest.TestCase):

    def setUp(self):
        """call the script and record its output"""
        result = subprocess.run(["./main_config.sh", "/Users/yasserkhan/Desktop/yaml/", "2018-12-23"], stdout=subprocess.PIPE)
        self.returncode = result.returncode
        # splitlines() avoids a trailing empty entry caused by the final newline
        self.output_lines = result.stdout.decode('utf-8').splitlines()

    def test_returncode(self):
        self.assertEqual(self.returncode, 0)

    def test_last_line_indicates_success(self):
        self.assertEqual(self.output_lines[-1], 'Writing to file')


if __name__ == '__main__':
    unittest.main()
Note that this code uses the backport of the Python 3 subprocess module. Also, it tries to decode the contents of result.stdout because on Python 3 that would be a bytes object and not a str as on Python 2. I didn't test it, but these two things should make the code portable between 2 and 3.
Also note that using absolute paths like "/Users/yasserkhan/Desktop/yaml" could easily break, so you will either need to find a relative path or pass a base path to your tests using environment variables for example.
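If you go the environment variable route, setUp could look roughly like this (YAML_BASE_DIR is a hypothetical variable name and ./yaml an assumed fallback, not something your script defines):

import os
import subprocess
import unittest


class TestScriptInvocation(unittest.TestCase):

    def setUp(self):
        """call the script against a configurable data directory"""
        # hypothetical: let the caller point the test at the data via YAML_BASE_DIR
        base_dir = os.environ.get("YAML_BASE_DIR", "./yaml")
        result = subprocess.run(["./main_config.sh", base_dir, "2018-12-23"],
                                stdout=subprocess.PIPE)
        self.returncode = result.returncode
        self.output_lines = result.stdout.decode('utf-8').splitlines()

    def test_returncode(self):
        self.assertEqual(self.returncode, 0)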
You could add additional tests that parse the other lines and check for reasonable outputs like a file size in the expected range.
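For instance, here is a sketch of such a test, added to the TestScriptInvocation class above; it assumes the script keeps printing lines of the form "File Size: 234 Bytes", and the 10 MB bound is an arbitrary choice:

    def test_file_size_in_expected_range(self):
        # find the "File Size: <n> Bytes" line and sanity-check the number
        size_lines = [line for line in self.output_lines if line.startswith("File Size:")]
        self.assertTrue(size_lines, "no 'File Size:' line in the output")
        size_bytes = int(size_lines[0].split()[2])
        self.assertGreater(size_bytes, 0)
        self.assertLess(size_bytes, 10 * 1024 * 1024)  # assumed upper bound of 10 MB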

Related

How to hide args to argparse when running script with exec?

I have 2 files, runner.py that runs target.py with subprocess or exec.
They both have command line options.
If runner runs target with subprocess it's ok:
$ python runner.py
run target.py with subprocess...
target.py: running with dummy = False
If runner runs target code with exec (with the -e option):
$ python runner.py -e
run target.py with exec...
usage: runner.py [-h] [-d]
runner.py: error: unrecognized arguments: -e
the command line argument -e is "seen" by target.py code (which accepts only one --dummy option) and raises an error.
How can I hide args to argparse when running script with exec?
Here's the code:
runner.py
import subprocess
import argparse

if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("-e", "--exec", help="run with exec", action="store_true")
    args = parser.parse_args()

    target_filename = "target.py"

    if args.exec:
        print("run target.py with exec...")
        source_code = open(target_filename).read()
        compiled = compile(source_code, filename=target_filename, mode="exec")
        exec(compiled)  # OPTION 1 - error on argparse
        # exec(compiled, {})  # OPTION 2 - target does not go inside "if main"
        # exec(compiled, dict(__name__="__main__"))  # OPTION 3 - same error as OPTION 1
    else:
        print("run target.py with subprocess...")
        subprocess.run(["python3", target_filename])
I tried to hide the globals with the commented options above, but without luck.
Seems related to how argparse works.
target.py
import argparse

if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("-d", "--dummy", help="a dummy option", action="store_true")
    args = parser.parse_args()
    print(f"target.py: running with dummy = {args.dummy}")
There is the argparse conflict_handler option, so one can write
argparse.ArgumentParser(conflict_handler='resolve')
in the target script. resolve removes the conflicting options, but that doesn't handle similar cases well, such as when the options have the same name in both runner and target, or when you can't or don't want to change the target file.
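For illustration, a minimal sketch of the resolve behaviour itself (a toy parser, not the original runner/target scripts):

import argparse

# with conflict_handler='resolve', redefining an option replaces the earlier
# definition instead of raising a "conflicting option string" error
parser = argparse.ArgumentParser(conflict_handler='resolve')
parser.add_argument("-d", "--dummy", action="store_true", help="first definition")
parser.add_argument("-d", "--dummy", action="store_true", help="second definition wins")
print(parser.parse_args(["-d"]))  # Namespace(dummy=True)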
Here's the solution I have found.
Internally argparse uses sys.argv to retrieve the options set on the command line.
You can directly set sys.argv = [target_filename], which removes the options, but changing sys.argv globally can cause a lot of other problems.
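If you do try the direct assignment, at least restore it afterwards; a rough sketch, reusing target_filename and compiled from the code above:

import sys

saved_argv = sys.argv
sys.argv = [target_filename]   # hide the runner's own options from target.py
try:
    exec(compiled)
finally:
    sys.argv = saved_argv      # always restore, even if target.py raises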
Using unittest.mock.patch (Python 3.4+), sys.argv can be safely altered like this:
from unittest.mock import patch
# [...]
source_code = open(target_filename).read()
compiled = compile(source_code, filename=target_filename, mode="exec")

# remove command args
with patch('sys.argv', [target_filename]):
    exec(compiled)
So one can also run target script code with options:
# run target
with patch('sys.argv', [target_filename]):
    exec(compiled)

# run target with -d
with patch('sys.argv', [target_filename, "-d"]):
    exec(compiled)

debugging argparse in python

May I know what is the best practice to debug a script that uses argparse?
Say I have a py file test_file.py with the following lines
# Script start
import argparse
import os

parser = argparse.ArgumentParser()
parser.add_argument("--output_dir", type=str, default="/data/xx")
args = parser.parse_args()
os.makedirs(args.output_dir)
# Script stop
The above script can be executed from terminal by:
python test_file.py --output_dir data/xx
However, for the debugging process, I would like to avoid using the terminal. Thus the workaround would be:
# other lines were commented out for the debugging process
# Thus, the active lines are
# Script start
import os
args = {"output_dir": "data/xx"}
os.makedirs(args.output_dir)
# Script stop
However, I am unable to execute the modified script. May I know what I have missed?
When used as a script, parse_args will produce a Namespace object, which displays as:
argparse.Namespace(output_dir='data/xx')
and then
args.output_dir
will be the value of that attribute.
In a test you could do one of several things:
args = parser.parse_args([....])  # a 'fake' sys.argv[1:] list
args = argparse.Namespace(output_dir='mydata')
and use args as before. Or simply call:
os.makedirs('data/xx')
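To spell out why the modified script in the question fails: a plain dict has no output_dir attribute, so either build a Namespace by hand or index the dict. A small sketch using the question's paths:

import argparse
import os

# build the Namespace by hand, so args.output_dir keeps working
args = argparse.Namespace(output_dir="data/xx")
os.makedirs(args.output_dir)

# or keep the dict from the question, but index it instead of using attribute access:
# args = {"output_dir": "data/xx"}
# os.makedirs(args["output_dir"])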
I would recommend organizing the script as:
# Script start
import argparse
import os

# this parser definition could be in a function
parser = argparse.ArgumentParser()
parser.add_argument("--output_dir", type=str, default="/data/xx")

def main(args):
    os.makedirs(args.output_dir)

if __name__ == '__main__':
    args = parser.parse_args()
    main(args)
That way the parse_args step isn't run when the file is imported. Whether you pass the args Namespace to main or pass values like args.output_dir, or a dictionary, etc. is your choice.
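With that layout, a debugging session (or another script) can import the module and call main directly; a hypothetical sketch, assuming the file above is saved as test_file.py:

# hypothetical debug driver for the layout above (file saved as test_file.py)
import argparse
import test_file

# no terminal needed: build the arguments yourself and call main()
test_file.main(argparse.Namespace(output_dir="data/xx"))

# or reuse the module-level parser with a fake argv list
args = test_file.parser.parse_args(["--output_dir", "data/yy"])
test_file.main(args)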
You can write a shell script to do what you want.
bash:
#!/bin/bash
cd /path/to/my/          # the directory that contains script.py
python script.py --output_dir data/xx
If that is insufficient, you can store your args in a json config file
configs.json
{"output_dir": "data/xx"}
To grab them:
import json

with open('configs.json', 'rb') as fh:
    args = json.loads(fh.read())

output_dir = args.get('output_dir')
# 'data/xx'
Do take note of the double quotes around your keys and values in the json file

How to handle python raw_input() in batch file

I created a batch file to run files in sequence; however, my Python file takes an input (from calling raw_input), and I am trying to figure out how to handle this from the batch file.
run.bat
The program doesn't proceed to the next line after the .py file is executed. For brevity, I just showed the necessary commands.
cd C:\Users\myname\Desktop
python myfile.py
stop
myfile.py
print ("Enter environment (dev | qa | prod) or stop to STOP")
environment = raw_input()
Here's a solution.
Take your myfile.py file, and change it to the following:
import sys

def main(arg=None):
    # Put all of your original myfile.py code here
    # You could also use raw_input as a fallback, if no arg is provided:
    if arg is None:
        arg = raw_input()
    # Keep going with the rest of your script

# if __name__ == "__main__" ensures this code doesn't run on import statements
if __name__ == "__main__":
    # arg = sys.argv[1] allows you to run this myfile.py directly, with the first
    # command line parameter, if you want
    if len(sys.argv) > 1:
        arg = sys.argv[1]
    else:
        arg = None
    main(arg)
Then create another python file, called wrapper.py:
# importing myfile here allows you to use it as its own self-contained module
import sys, myfile

# this loop goes through the command line params starting at index 1
# (index 0 is the name of the script itself)
for arg in sys.argv[1:]:
    myfile.main(arg)
And then at the command line, you can simply type:
python wrapper.py dev qa prod
You could also put the above line of code in your run.bat file, making it look as follows:
cd C:\Users\myname\Desktop
python wrapper.py dev qa prod
stop
The question is not really related to Python but to your shell. According to http://ss64.com/nt/syntax-redirection.html, cmd.exe uses the same piping syntax as a unix shell (cmd1 | cmd2), so your bat file should work fine when it is invoked with a command that sends the desired input to standard output (which the pipe then feeds to your script).
Edit: added example
echo "dev" | run.bat
C:\Python27\python.exe myfile.py
Enter environment (dev | qa | prod) or stop to STOP
environment="dev"

Logging last Bash command to file from script

I write lots of small scripts to manipulate files on a Bash-based server. I would like to have a mechanism by which to log which commands created which files in a given directory. However, I don't just want to capture every input command, all the time.
Approach 1: a wrapper script that uses a Bash builtin (a la history or fc -ln -1) to grab the last command and write it to a log file. I have not been able to figure out any way to do this, as the shell builtin commands do not appear to be recognized outside of the interactive shell.
Approach 2: a wrapper script that pulls from ~/.bash_history to get the last command. This, however, requires setting up the Bash shell to flush every command to history immediately (as per this comment) and seems also to require that the history be allowed to grow inexorably. If this is the only way, so be it, but it would be great to avoid having to edit the ~/.bashrc file on every system where this might be implemented.
Approach 3: use script. My problem with this is that it requires multiple commands to start and stop the logging, and because it launches its own shell it is not callable from within another script (or at least, doing so complicates things significantly).
I am trying to figure out an implementation that's of the form log_this.script other_script other_arg1 other_arg2 > file, where everything after the first argument is logged. The emphasis here is on efficiency and minimizing syntax overhead.
EDIT: iLoveTux and I both came up with similar solutions. For those interested, my own implementation follows. It is somewhat more constrained in its functionality than the accepted answer, but it also auto-updates any existing logfile entries with changes (though not deletions).
Sample usage:
$ cmdlog.py "python3 test_script.py > test_file.txt"
creates a log file in the parent directory of the output file with the following:
2015-10-12#10:47:09 test_file.txt "python3 test_script.py > test_file.txt"
Additional file changes are added to the log;
$ cmdlog.py "python3 test_script.py > test_file_2.txt"
the log now contains
2015-10-12#10:47:09 test_file.txt "python3 test_script.py > test_file.txt"
2015-10-12#10:47:44 test_file_2.txt "python3 test_script.py > test_file_2.txt"
Running on the original file name again changes the file order in the log, based on modification time of the files:
$ cmdlog.py "python3 test_script.py > test_file.txt"
produces
2015-10-12#10:47:44 test_file_2.txt "python3 test_script.py > test_file_2.txt"
2015-10-12#10:48:01 test_file.txt "python3 test_script.py > test_file.txt"
Full script:
#!/usr/bin/env python3
'''
A wrapper script that will write the command-line
args associated with any files generated to a log
file in the directory where the files were made.
'''
import sys
import os
from os import listdir
from os.path import isfile, join
import subprocess
import time
from datetime import datetime
def listFiles(mypath):
    """
    Return relative paths of all files in mypath
    """
    return [join(mypath, f) for f in listdir(mypath) if
            isfile(join(mypath, f))]

def read_log(log_file):
    """
    Reads a file history log and returns a dictionary
    of {filename: command} entries.
    Expects tab-separated lines of [time, filename, command]
    """
    entries = {}
    with open(log_file) as log:
        for l in log:
            l = l.strip()
            mod, name, cmd = l.split("\t")
            # cmd = cmd.lstrip("\"").rstrip("\"")
            entries[name] = [cmd, mod]
    return entries

def time_sort(t, fmt):
    """
    Turn a strftime-formatted string into a tuple
    of time info
    """
    parsed = datetime.strptime(t, fmt)
    return parsed
ARGS = sys.argv[1]
ARG_LIST = ARGS.split()
# Guess where logfile should be put
if (">" or ">>") in ARG_LIST:
# Get position after redirect in arg list
redirect_index = max(ARG_LIST.index(e) for e in ARG_LIST if e in ">>")
output = ARG_LIST[redirect_index + 1]
output = os.path.abspath(output)
out_dir = os.path.dirname(output)
elif ("cp" or "mv") in ARG_LIST:
output = ARG_LIST[-1]
out_dir = os.path.dirname(output)
else:
out_dir = os.getcwd()
# Set logfile location within the inferred output directory
LOGFILE = out_dir + "/cmdlog_history.log"
# Get file list state prior to running
all_files = listFiles(out_dir)
pre_stats = [os.path.getmtime(f) for f in all_files]
# Run the desired external commands
subprocess.call(ARGS, shell=True)
# Get done time of external commands
TIME_FMT = "%Y-%m-%d#%H:%M:%S"
log_time = time.strftime(TIME_FMT)
# Get existing entries from logfile, if present
if LOGFILE in all_files:
    logged = read_log(LOGFILE)
else:
    logged = {}
# Get file list state after run is complete
post_stats = [os.path.getmtime(f) for f in all_files]
post_files = listFiles(out_dir)
# Find files whose states have changed since the external command
changed = [e[0] for e in zip(all_files, pre_stats, post_stats) if e[1] != e[2]]
new = [e for e in post_files if e not in all_files]
all_modded = list(set(changed + new))
if not all_modded:  # exit early, no need to log
    sys.exit(0)
# Replace files that have changed, add those that are new
for f in all_modded:
    name = os.path.basename(f)
    logged[name] = [ARGS, log_time]
# Write changed files to logfile
with open(LOGFILE, 'w') as log:
    for name, info in sorted(logged.items(), key=lambda x: time_sort(x[1][1], TIME_FMT)):
        cmd, mod_time = info
        if not cmd.startswith("\""):
            cmd = "\"{}\"".format(cmd)
        log.write("\t".join([mod_time, name, cmd]) + "\n")
sys.exit(0)
You can use the tee command, which stores its standard input to a file and outputs it on standard output. Pipe the command line into tee, and pipe tee's output into a new invocation of your shell:
echo '<command line to be logged and executed>' | \
tee --append /path/to/your/logfile | \
$SHELL
i.e., for your example of other_script other_arg1 other_arg2 > file,
echo 'other_script other_arg1 other_arg2 > file' | \
tee --append /tmp/mylog.log | \
$SHELL
If your command line needs single quotes, they need to be escaped properly.
OK, so you don't mention Python in your question, but it is tagged Python, so I figured I would see what I could do. I came up with this script:
import sys
from os.path import expanduser, join
from subprocess import Popen, PIPE

def issue_command(command):
    process = Popen(command, stdout=PIPE, stderr=PIPE, shell=True)
    return process.communicate()

home = expanduser("~")
log_file = join(home, "command_log")
command = sys.argv[1:]

with open(log_file, "a") as fout:
    fout.write("{}\n".format(" ".join(command)))

out, err = issue_command(command)
which you can call like (if you name it log_this and make it executable):
$ log_this echo hello world
and it will put "echo hello world" in a file ~/command_log. Note, though, that if you want to use pipes or redirection you have to quote your command (this may or may not be a real downfall for your use case, but I haven't figured out how to avoid the quotes just yet), like this:
$ log_this "echo hello world | grep h >> /tmp/hello_world"
but since it's not perfect, I thought I would add a little something extra.
The following script allows you to specify a different file to log your commands to as well as record the execution time of the command:
#!/usr/bin/env python
from subprocess import Popen, PIPE
import argparse
from os.path import expanduser, join
from time import time

def issue_command(command):
    process = Popen(command, stdout=PIPE, stderr=PIPE, shell=True)
    return process.communicate()

home = expanduser("~")
default_file = join(home, "command_log")

parser = argparse.ArgumentParser()
parser.add_argument("-f", "--file", type=argparse.FileType("a"), default=default_file)
parser.add_argument("-p", "--profile", action="store_true")
parser.add_argument("command", nargs=argparse.REMAINDER)
args = parser.parse_args()

if args.profile:
    start = time()
    out, err = issue_command(args.command)
    runtime = time() - start
    entry = "{}\t{}\n".format(" ".join(args.command), runtime)
    args.file.write(entry)
else:
    out, err = issue_command(args.command)
    entry = "{}\n".format(" ".join(args.command))
    args.file.write(entry)

args.file.close()
You would use this the same way as the other script, but if you want to specify a different file to log to, just pass -f <FILENAME> before your actual command and your log will go there; and if you want to record the execution time, just provide -p (for profile) before your actual command, like so:
$ log_this -p -f ~/new_log "echo hello world | grep h >> /tmp/hello_world"
I will try to make this better, but if you can think of anything else this could do for you, I am making a github project for this where you can submit bug reports and feature requests.

Need help with python script with bash commands

I copied this script from the internet but I don't know how to use it. I am a newbie to Python, so please help. When I execute it using
./test.py I can only see:
usage: py4sa [option]
A unix toolbox
options:
--version show program's version number and exit
-h, --help show this help message and exit
-i, --ip gets current IP Address
-u, --usage gets disk usage of homedir
-v, --verbose prints verbosely
When I type py4sa it says bash: command not found.
The full script is:
#!/usr/bin/env python
import subprocess
import optparse
import re
#Create variables out of shell commands
#Note triple quotes can embed Bash
#You could add another bash command here
#HOLDING_SPOT="""fake_command"""
#Determines Home Directory Usage in Gigs
HOMEDIR_USAGE = """
du -sh $HOME | cut -f1
"""
#Determines IP Address
IPADDR = """
/sbin/ifconfig -a | awk '/(cast)/ { print $2 }' | cut -d':' -f2 | head -1
"""
#This function takes Bash commands and returns them
def runBash(cmd):
    p = subprocess.Popen(cmd, shell=True, stdout=subprocess.PIPE)
    out = p.stdout.read().strip()
    return out  #This is the stdout from the shell command
VERBOSE=False
def report(output, cmdtype="UNIX COMMAND:"):
    #Notice the global statement allows input from outside of function
    if VERBOSE:
        print "%s: %s" % (cmdtype, output)
    else:
        print output
#Function to control option parsing in Python
def controller():
    global VERBOSE
    #Create instance of OptionParser Module, included in Standard Library
    p = optparse.OptionParser(description='A unix toolbox',
                              prog='py4sa',
                              version='py4sa 0.1',
                              usage='%prog [option]')
    p.add_option('--ip', '-i', action="store_true", help='gets current IP Address')
    p.add_option('--usage', '-u', action="store_true", help='gets disk usage of homedir')
    p.add_option('--verbose', '-v',
                 action='store_true',
                 help='prints verbosely',
                 default=False)
    #Option Handling passes correct parameter to runBash
    options, arguments = p.parse_args()
    if options.verbose:
        VERBOSE = True
    if options.ip:
        value = runBash(IPADDR)
        report(value, "IPADDR")
    elif options.usage:
        value = runBash(HOMEDIR_USAGE)
        report(value, "HOMEDIR_USAGE")
    else:
        p.print_help()
#Runs all the functions
def main():
    controller()

#This idiom means the below code only runs when executed from command line
if __name__ == '__main__':
    main()
It seems to me you have stored the script under another name: test.py rather than py4sa. So typing ./test.py, like you did, is correct for you. The program requires arguments, however, so you have to enter one of the options listed under 'usage'.
Normally 'py4sa [OPTIONS]' would mean that OPTIONS is optional, but looking at the code we can see that it isn't:
if options.verbose:
    # ...
if options.ip:
    # ...
elif options.usage:
    # ...
else:
    # Here's a "catch all" in case no options are supplied.
    # It will show the help text you get:
    p.print_help()
Note that the program probably would not be recognized by bash even if you renamed it to py4sa, as the current directory is often not in bash's PATH. It says 'usage: py4sa (..)' because that's hard-coded into the program.
The script is called "test.py". Either invoke it as such, or rename it to "py4sa".
You run a Python script using the interpreter, so:
$ python py4sa
