Unittest with command-line arguments - python

From what I understand from another SO post, to unittest a script that takes command line arguments through argparse, I should do something like the code below, giving sys.argv[0] as arg.
import unittest
import match_loc
class test_method_main(unittest.TestCase):
    loc = match_loc.main()
    self.assertEqual(loc, [4])
if __name__ == '__main__':
    sys.argv[1] = 'aaaac'
    sys.argv[2] = 'ac'
    unittest.main(sys.argv[0])
This returns the error:
usage: test_match_loc.py [-h] text patterns [patterns ...]
test_match_loc.py: error: the following arguments are required: text, patterns
I would like to understand what is going on here deeper. I understand
if __name__ == '__main__':
main()
says that if this is being executed by the 'main', highest level, default interpreter, to just automatically run the 'main' method. I'm assuming
if __name__ == '__main__':
unittest.main()
just happens to be the way you say this for running unittest scripts.
I understand when any script is run, it automatically has an argv object, a vector collecting all the items on the command line.
But I do not understand what unittest.main(sys.argv[0]) would do. What does 'unittest.main' do with arguments? How can I pre-set the values of sys.argv - doesn't it automatically reset every time you run a script? Furthermore, where does this object 'sys.argv' exist, if outside of any script? Finally, what is the correct way to implement tests of command-line arguments?
I am sorry if my questions are vague and misguided. I would like to understand all the components relevant here so I can actually understand what I am doing.
Thank you very much.

Just by playing around with a pair of simple files, I find that modifying sys.argv in the body of the caller module affects the sys.argv that the imported module sees:
import sys
import unittest
# Slice assignment works even when the script was started without arguments.
sys.argv[1:] = ['aaaac', 'ac']
class test_method_main(unittest.TestCase):
    ...
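But modifying sys.argv in the main block, as you do, does not show up in the imported module. One likely reason is that statements in a class body run as soon as the class is defined, which happens before the if __name__ == '__main__' block at the bottom of the file is reached. We could dig into the documentation (and code) to see exactly why, but I think it's enough to just identify what works.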
Here's what I reconstructed from your previous question of the imported module - with a few diagnostic prints
import argparse
import sys
def main():
    print(sys.argv)
    parser = argparse.ArgumentParser(
        description='Takes a series of patterns as fasta files'
        ' or strings and a text as fasta file or string and'
        ' returns the match locations by constructing a trie.')
    parser.add_argument('text')
    parser.add_argument('patterns', nargs='+')
    args = parser.parse_args()
    print(args)
    return 1
You could also test a parser with your own list of strings, recognising that parse_args uses sys.argv[1:] if its argument is missing or None:
def main(argv=None):
    print(argv)
    ...
    args = parser.parse_args(argv)
    print(args)
    return 1
# and in the caller
loc = match_loc.main(['abc','ab'])
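As a rough sketch of how both approaches could look in a test file: the main below is only a stand-in for match_loc.main (its body and return value are assumptions, since the real module isn't shown), and the test names are made up for illustration. The first test passes the arguments as a list; the second temporarily swaps out sys.argv with unittest.mock.patch.object.

```python
import argparse
import sys
import unittest
from unittest import mock


def main(argv=None):
    # Stand-in for match_loc.main(); the real matching logic is not shown.
    parser = argparse.ArgumentParser()
    parser.add_argument('text')
    parser.add_argument('patterns', nargs='+')
    # With argv=None, parse_args falls back to sys.argv[1:].
    args = parser.parse_args(argv)
    return args.text, args.patterns


class TestMain(unittest.TestCase):
    def test_explicit_list(self):
        # Pass the arguments directly; sys.argv is never touched.
        self.assertEqual(main(['aaaac', 'ac']), ('aaaac', ['ac']))

    def test_patched_sys_argv(self):
        # sys.argv is replaced only for the duration of the with-block.
        with mock.patch.object(sys, 'argv', ['prog', 'aaaac', 'ac', 'c']):
            self.assertEqual(main(), ('aaaac', ['ac', 'c']))
```

Saved in a test file, this could be run with python -m unittest, which avoids having to touch sys.argv in an if __name__ == '__main__' block at all.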
Even though I was able to construct a working test case, you really should have given enough information that I didn't need to guess or dig around.

Related

How to run different python functions from command line

I'm trying to run different functions from a python script (some with arguments and some without).
So far I have
import sys
def math(x):
    ans = 2*x
    print(ans)
def function1():
    print("hello")
if __name__ == '__main__':
    globals()[sys.argv[1]]()
and in the command line if I type python scriptName.py math(2)
I get the error
File "scriptName.py", line 28, in <module>
globals()[sys.argv[1]]()
KeyError: 'math(2)'
New to python and programming so any help would be appreciated. This is also a general example...my real script will have a lot more functions.
Thank you
Try this!
import argparse
def math(x):
    try:
        print(int(x) * 2)
    except ValueError:
        print(x, "is not a number!")
def function1(name):
    print("Hello!", name)
if __name__ == '__main__':
    # if you type --help
    parser = argparse.ArgumentParser(description='Run some functions')
    # Add a command
    parser.add_argument('--math', help='multiply the integer by 2')
    parser.add_argument('--hello', help='say hello')
    # Get our arguments from the user
    args = parser.parse_args()
    if args.math:
        math(args.math)
    if args.hello:
        function1(args.hello)
You run it from your terminal like so:
python script.py --math 5 --hello ari
And you will get
>> 10
>> Hello! ari
You can use --help to describe your script and its options
python script.py --help
Will print out
Run some functions

optional arguments:
  -h, --help     show this help message and exit
  --math MATH    multiply the integer by 2
  --hello HELLO  say hello
Read More: https://docs.python.org/3/library/argparse.html
Here is another approach you can take:
#!/usr/bin/env python3
"""Safely run Python functions from command line.
"""
import argparse
import ast
import operator
def main():
    # parse arguments
    parser = argparse.ArgumentParser(description=__doc__)
    parser.add_argument("function", help="Python function to run.")
    parser.add_argument("args", nargs='*')
    opt = parser.parse_args()
    # try to get the function from the operator module
    try:
        func = getattr(operator, opt.function)
    except AttributeError:
        raise AttributeError(f"The function {opt.function} is not defined.")
    # try to safely eval the arguments; literal_eval raises SyntaxError for
    # unparsable input and ValueError for anything that is not a literal
    try:
        args = [ast.literal_eval(arg) for arg in opt.args]
    except (SyntaxError, ValueError):
        raise SyntaxError(f"The arguments to {opt.function} "
                          f"were not properly formatted.")
    # run the function and pass in the args, print the output to stdout
    print(func(*args))
if __name__ == "__main__":
    main()
Then you can execute this by doing the following:
./main.py pow 2 2
4
We use the argparse module from Python's Standard Library to facilitate the parsing of arguments here. The usage for the script is below:
usage: main.py [-h] function [args [args ...]]
function is the name of the function you want to run. The way this is currently structured is to pull functions from the operator module, but this is just an example. You can easily create your own file containing functions and use that instead, or just pull them from globals().
Following function, you can supply any number of arguments that you want. Those arguments will be run through ast.literal_eval to safely parse them and get the corresponding types.
The cool thing about this is your arguments are not strictly limited to strings and numbers. You can pass in any literal. Here is an example with a tuple:
./main.py getitem '(1, 2, 3)' 1
2
These arguments are then passed to the selected function, and the output is printed to stdout. Overall, this gives you a pretty flexible framework whose functionality you can easily expand. Plus, it avoids having to use eval, which greatly reduces the risk of running something harmful.
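If you would rather expose your own functions than the operator module, one possible variation is an explicit registry. This is only a sketch: the COMMANDS dict, the run helper, and the name double (used instead of math to avoid clashing with the standard library module name) are all made up for illustration.

```python
import argparse
import ast


def double(x):
    return 2 * x


def hello():
    return "hello"


# Only functions listed here can be called from the command line.
COMMANDS = {"double": double, "hello": hello}


def run(argv=None):
    parser = argparse.ArgumentParser()
    # choices makes argparse reject unknown function names with a usage error.
    parser.add_argument("function", choices=sorted(COMMANDS))
    parser.add_argument("args", nargs="*")
    opt = parser.parse_args(argv)
    # literal_eval only accepts Python literals (numbers, strings, tuples, ...).
    args = [ast.literal_eval(arg) for arg in opt.args]
    return COMMANDS[opt.function](*args)


print(run(["double", "2"]))
print(run(["hello"]))
```

Calling run() with no argument reads sys.argv[1:], so the same function can serve both the command line and tests. A registry is more typing than globals(), but it keeps helper functions and imports from being callable by accident.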
Why not to use eval:
Here is a small example of why just using eval is so unsafe. If you were to simply use the following code to solve your issue:
import sys
def math(x):
    ans = 2*x
    print(ans)
def function1():
    print("hello")
if __name__ == '__main__':
    print(eval(sys.argv[1])) # DO NOT DO IT THIS WAY
Someone could pass in an argument like so:
python main.py '__import__("shutil").rmtree("/directory_you_really_dont_want_to_delete/")'
This would, in effect, import the shutil module and then call the rmtree function to remove a directory you really do not want to delete. (eval only accepts expressions, so a plain import statement would raise a SyntaxError; the built-in __import__ function sidesteps that.) Obviously this is a trivial example, but I am sure you can see the potential here to do something really malicious. An even more malicious, yet easily accessible, example would be to import subprocess and use recursive calls to the script to fork-bomb the host, but I am not going to share that code here for obvious reasons. There is nothing stopping that user from downloading a malicious third party module and executing code from it here (a topical example would be jeilyfish, which has since been removed from PyPI). eval does not ensure that the code is "safe" before running it; it evaluates any syntactically valid Python expression given to it.
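By contrast, here is a small sketch of how ast.literal_eval reacts to the same kind of input. The helper name parse_literal is hypothetical; the point is only that a function call is not a literal, so it is rejected rather than executed.

```python
import ast


def parse_literal(text):
    """Return (True, value) for a Python literal, (False, None) otherwise."""
    try:
        return True, ast.literal_eval(text)
    except (ValueError, SyntaxError):
        # Calls, names, imports and malformed input all end up here;
        # nothing in the string is ever executed.
        return False, None


print(parse_literal('(1, 2, 3)'))
print(parse_literal('__import__("shutil").rmtree("/some/dir")'))
```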

Passing directory paths as strings to argparse in Python

Scenario: I have a python script that receives as inputs 2 directory paths (input and output folders) and a variable ID. With these, it performs a data gathering procedure from xlsx and xlsm macros, modifies the data and saves to a csv (from the input folder, the inner functions of the code will run loops, to get multiple files and process them, one at a time).
Issue: Since the code was working fine when I was running it from the Spyder console, I decided to step it up and learn about calling it from cmd, argparse and the main function. I am trying to implement that, but I get the following error:
Unrecognized arguments (the output path I pass from cmd)
Question: Any ideas on what I am doing wrong?
Obs: If the full script is required, I can post it here, but since it works when run from Spyder, I believe the error is in my argparse function.
Code (argparse function and __main__):
# This is a function to parse arguments:
def parserfunc():
    import argparse
    parser = argparse.ArgumentParser(description='Process Files')
    parser.add_argument('strings', nargs=3)
    args = parser.parse_args()
    arguments = args.strings
    return arguments
# This is the main caller
def main():
    arguments = parserfunc()
    # this next function is where I do the processing for the files, based on the paths and id provided):
    modifierfunc(arguments[0], arguments[1], arguments[2])
#
if __name__ == "__main__":
    main()
If you have decided to use argparse, then make use of named arguments rather than indexed ones. The following is an example:
import argparse
parser = argparse.ArgumentParser()
parser.add_argument('input')
parser.add_argument('output')
parser.add_argument('id')
args = parser.parse_args()
print(args.input, args.output, args.id) # this is how you use them
In case you miss one of them on program launch, you will get human readable error message like
error: the following arguments are required: id
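One common cause of "unrecognized arguments" with directory paths is a space in an unquoted path: the shell splits it into several arguments before Python ever sees them. Whether that is what happened here is only a guess. The sketch below passes an explicit list to parse_args to mimic a correctly quoted command line, and converts the paths with pathlib.Path; the example paths and the parse helper are made up.

```python
import argparse
from pathlib import Path


def parse(argv=None):
    parser = argparse.ArgumentParser(description='Process Files')
    # type=Path converts each string to a pathlib.Path during parsing.
    parser.add_argument('input', type=Path, help='input folder')
    parser.add_argument('output', type=Path, help='output folder')
    parser.add_argument('id')
    return parser.parse_args(argv)


# Each list item is one argument, as if the path with a space had been quoted.
args = parse(['C:/data/in folder', 'C:/data/out', 'A42'])
print(args.input, args.output, args.id)
```

On the command line, the equivalent would be to wrap any path containing spaces in double quotes.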
You could drop the entire parserfunc() function.
sys.argv does indeed contain all the command-line arguments (always as strings), as mentioned by grapes. Note that sys.argv[0] is the script name, so the three arguments start at index 1.
So instead of this:
modifierfunc(arguments[0], arguments[1], arguments[2])
This should suffice:
import sys
modifierfunc(sys.argv[1], sys.argv[2], sys.argv[3])
Perhaps first do a print, to see whether sys.argv holds the values you expect.
print('Argument 1='+sys.argv[1])
print('Argument 2='+sys.argv[2])
print('Argument 3='+sys.argv[3])

I'd like my code to flow with my terminal a little better

So I have a fully functional py script running on Ubuntu 12.04, everything works great. Except I don't like my input methods, it's getting annoying as you'll see below. Before I type out the code, I should say that the code takes two images in a .img format and then does computations on them. Here's what I have:
import os
first = raw_input("Full path to first .img file: ")
second = raw_input("Full path to second .img file: ")
print " "
if os.path.exists(first) == True:
    if first.endswith('.img') == False:
        print 'You did not give a .img file, try running again'
        os.sys.exit()
elif os.path.exists(second) == True:
    if second.endswith('.img') == False:
        print 'You did not give a .img file, try running again'
        os.sys.exit()
else:
    print "Your path does not exist, probably a typo. Try again"
    os.sys.exit()
Here's what I want; I want to be able to feed python this input straight from the Terminal. In other words, I want to be able to input in the terminal something like
python myscript.py with the two images as input
This way I could make use of the terminal's tab-key shortcut when specifying paths and stuff. Any ideas/suggestions?
EDIT: Ok so I looked into the parsing, and I think I got down how to use it. Here's my code:
import argparse
import nipy
parser = argparse.ArgumentParser()
parser.add_argument("-im", "--image_input", help = "Feed the program an image", type = nipy.core.image.image.Image, nargs = 2)
However, now I want to be able to use these files in the script by saying something like first = parser[0] and second = parser[1], and then do stuff on first and second. Is this achievable?
You want to parse the command line arguments instead of reading input after the program starts.
Use the argparse module for that, or parse sys.argv yourself.
Seeing that the parsing code already exists, all you need to do is accept command-line arguments with Python's sys module:
import sys
first = sys.argv[1]
second = sys.argv[2]
Or, more generally:
import os
import sys
if __name__ == '__main__':
    if len(sys.argv) < 2:
        print('USAGE: python %s [image-paths]' % sys.argv[0])
        sys.exit(1)
    image_paths = sys.argv[1:]
    for image_path in image_paths:
        if not os.path.exists(image_path):
            print('Your path does not exist, probably a typo. Try again.')
            sys.exit(1)
        # Reject anything that does not have the .img extension.
        if not image_path.endswith('.img'):
            print('You did not give a .img file, try running again.')
            sys.exit(1)
NOTES
The first part of the answer gives you what you need to accept command-line arguments. The second part introduces a few useful concepts for dealing with them:
When running a python file as a script, the global variable __name__ is set to '__main__'. If you use the if __name__ == '__main__' clause, you can either run the python file as a script (in which case the clause executes) or import it as a module (in which case it does not). You can read more about it here.
It is customary to print a usage message and exit if the script invocation was wrong.
The variable sys.argv is set to a list of the command-line arguments, and its first item is always the script path, so len(sys.argv) < 2 means no arguments were passed. If you want exactly two arguments, you can use len(sys.argv) != 3 instead.
sys.argv[1:] contains the actual command-line arguments. If you want exactly two arguments, you can reference them via sys.argv[1] and sys.argv[2] instead.
Please don't use the if os.path.exists(...) == True and if string.endswith(...) == False syntax. It is much clearer and much more Pythonic to write if os.path.exists(...) and if not string.endswith(...) instead.
Using exit() without an argument defaults to exit(0), which means the program terminated successfully. If you are exiting with an error message, you should use exit(1) (or some other non-zero value...) instead.
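For the argparse route from the question's edit, the parsed values live on the namespace returned by parse_args, not on the parser itself. The following is a minimal sketch under a few assumptions: the paths are kept as plain strings (loading them with nipy is left out), and the helper name parse_images is made up.

```python
import argparse
import os


def parse_images(argv=None):
    parser = argparse.ArgumentParser()
    parser.add_argument('-im', '--image_input', nargs=2, required=True,
                        metavar='IMG', help='paths to the two .img files')
    args = parser.parse_args(argv)
    # nargs=2 yields a list of exactly two strings.
    first, second = args.image_input
    for path in (first, second):
        if not path.endswith('.img'):
            # parser.error prints the usage message and exits with status 2.
            parser.error('%s is not a .img file' % path)
        if not os.path.exists(path):
            parser.error('%s does not exist' % path)
    return first, second
```

It could then be called as first, second = parse_images() at the top of the script and run with python myscript.py -im one.img two.img, with tab completion working as usual in the terminal.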
What you want to do is take in command line parameters, and the best way to do that is using a nifty module called argparse. I have listed below a good resource on how to install and use it.
Here is a great resource for argparse. It is a module used to take command line arguments.
You can probably use sys.argv:
import sys
first = sys.argv[1]
second = sys.argv[2]
Don't forget to check len(sys.argv) before.

Is there a way to add an already created parser as a subparser in argparse?

Normally, to add a subparser in argparse you have to do:
parser = ArgumentParser()
subparsers = parser.add_subparsers()
subparser = subparsers.add_parser('name')
The problem I'm having is I'm trying to add another command line script, with its own parser, as a subcommand of my main script. Is there an easy way to do this?
EDIT: To clarify, I have a file script.py that looks something like this:
import argparse
def initparser():
    parser = argparse.ArgumentParser()
    parser.add_argument('--foo')
    parser.add_argument('--bar')
    return parser
def func(args):
    # args is a Namespace, this function does stuff with it
    ...
if __name__ == '__main__':
    initparser().parse_args()
So I can run this like:
python script.py --foo --bar
I'm trying to write a module app.py that's a command line interface with several subcommands, so i can run something like:
python app.py script --foo --bar
Rather than copy and pasting all of the initparser() logic over to app.py, I'd like to be able to directly use the parser i create from initparser() as a sub-parser. Is this possible?
You could use the parents parameter
p = argparse.ArgumentParser()
s = p.add_subparsers()
ss = s.add_parser('script', parents=[initparser()], add_help=False)
p.parse_args('script --foo sst'.split())
ss is a parser that shares all the arguments defined for initparser. The add_help=False is needed on either ss or initparser so -h is not defined twice.
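Put together as a self-contained sketch, with initparser copied from the question and a hypothetical build_app_parser wrapper plus a dest='command' added so the chosen subcommand is visible in the result:

```python
import argparse


def initparser():
    parser = argparse.ArgumentParser()
    parser.add_argument('--foo')
    parser.add_argument('--bar')
    return parser


def build_app_parser():
    app = argparse.ArgumentParser(prog='app.py')
    subparsers = app.add_subparsers(dest='command')
    # add_help=False here because initparser() already defines -h/--help,
    # and the parent's options are copied into the subparser.
    subparsers.add_parser('script', parents=[initparser()], add_help=False)
    return app


args = build_app_parser().parse_args(['script', '--foo', 'sst'])
print(args)
```

Note that parents copies the argument definitions at the time add_parser is called, so arguments added to the original parser afterwards would not show up in the subcommand.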
You might want to take a look at the shlex module as it sounds to me like you're trying to hack the ArgumentParser to do something that it wasn't actually intended to do.
Having said that, it's a little difficult to figure out a good answer without examples of what it is, exactly, that you're trying to parse.
I think your problem can be addressed by a declarative wrapper for argparse. The one I wrote is called Argh. It helps with separating definition of commands (with all arguments-related stuff) from assembling (including subparsers) and dispatching.
This is a way old question, but I wanted to throw out another alternative. And that is to think in terms of inversion of control. By this I mean the root ArgumentParser would manage the creation of the subparsers:
# root_argparse.py
from argparse import ArgumentParser, Namespace

__ARG_PARSER = ArgumentParser('My Script')
__SUBPARSERS = __ARG_PARSER.add_subparsers(dest='subcommand')
__SUBPARSERS.required = True

def get_subparser(name: str, **kwargs) -> ArgumentParser:
    return __SUBPARSERS.add_parser(name, **kwargs)

def parse_args(**kwargs) -> Namespace:
    return __ARG_PARSER.parse_args(**kwargs)

# my_script.py
from argparse import ArgumentParser
from root_argparse import get_subparser

__ARG_PARSER = get_subparser('script')
__ARG_PARSER.add_argument('--foo')
__ARG_PARSER.add_argument('--bar')

def do_stuff(args):
    ...

# main.py
from root_argparse import parse_args
import my_script  # importing the module registers its subparser

if __name__ == '__main__':
    args = parse_args()
    # do stuff with args
Seems to work okay from some quick testing I did.

Parsing empty options in Python

I have an application that allows you to send event data to a custom script. You simply lay out the command line arguments and assign what event data goes with what argument. The problem is that there is no real flexibility here. Every option you map out is going to be used, but not every option will necessarily have data. So when the application builds the string to send to the script, some of the arguments are blank and python's OptionParser errors out with "error: --someargument option requires an argument"
Being that there are over 200 points of data, it's not like I can write separate scripts to handle each combination of possible arguments (it would take 2^200 scripts). Is there a way to handle empty arguments in python's optionparser?
Sorry, misunderstood the question with my first answer. You can give command-line flags an optional argument by using the callback action type when you define an option. Use the following function as a callback (you will likely wish to tailor it to your needs) and configure it for each of the flags that can optionally receive an argument:
import optparse
def optional_arg(arg_default):
    def func(option, opt_str, value, parser):
        # Consume the next token only if it does not look like another flag.
        if parser.rargs and not parser.rargs[0].startswith('-'):
            val = parser.rargs[0]
            parser.rargs.pop(0)
        else:
            val = arg_default
        setattr(parser.values, option.dest, val)
    return func
def main(args):
    parser = optparse.OptionParser()
    parser.add_option('--foo', action='callback', callback=optional_arg('empty'), dest='foo')
    parser.add_option('--file', action='store_true', default=False)
    return parser.parse_args(args)
if __name__ == '__main__':
    import sys
    # Skip sys.argv[0] (the script name) so it is not reported as a positional argument.
    print(main(sys.argv[1:]))
Running from the command line you'll see this:
# python parser.py
(<Values at 0x8e42d8: {'foo': None, 'file': False}>, [])
# python parser.py --foo
(<Values at 0x8e42d8: {'foo': 'empty', 'file': False}>, [])
# python parser.py --foo bar
(<Values at 0x8e42d8: {'foo': 'bar', 'file': False}>, [])
Yes, there is an argument to do so when you add the option:
from optparse import OptionParser
parser = OptionParser()
parser.add_option("--SomeData",action="store", dest="TheData", default='')
Set the default argument to the value you want the option to have when no value is supplied for it.
I don't think optparse can do this. argparse is a different module (part of the standard library since Python 2.7 and 3.2) that can handle situations like this, where the options have optional values.
With optparse you have to either specify the option including its value or leave out both.
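As a minimal sketch of what that looks like in argparse: nargs='?' makes the value optional, const is used when the flag appears without a value, and default when the flag is absent. The option names and the 'empty' placeholder are only illustrative.

```python
import argparse

parser = argparse.ArgumentParser()
# default: flag absent; const: flag present without a value.
parser.add_argument('--foo', nargs='?', const='empty', default=None)
parser.add_argument('--file', action='store_true')

absent = parser.parse_args([])
bare = parser.parse_args(['--foo', '--file'])
valued = parser.parse_args(['--foo', 'bar'])
print(absent.foo, bare.foo, valued.foo)
```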
Optparse already allows you to pass the empty string as an option argument. So if possible, treat the empty string as "no value". For long options, any of the following work:
my_script --opt= --anotheroption
my_script --opt='' --anotheroption
my_script --opt="" --anotheroption
my_script --opt '' --anotheroption
my_script --opt "" --anotheroption
For short-style options, you can use either of:
my_script -o '' --anotheroption
my_script -o "" --anotheroption
Caveat: this has been tested under Linux and should work the same under other Unixlike systems; Windows handles command line quoting differently and might not accept all of the variants listed above.
Mark Roddy's solution would work, but it requires attribute modification of a parser object during runtime, and has no support for alternative option formattings other than - or --.
A slightly less involved solution is to modify the sys.argv array before running optparse and insert an empty string ("") after a switch which doesn't need to have arguments.
The only constraint of this method is that you have your options default to a predictable value other than the one you are inserting into sys.argv (I chose None for the example below, but it really doesn't matter).
The following code creates an example parser and set of options, extracts an array of allowed switches from the parser (using a little bit of instance variable magic), and then iterates through sys.argv. Every time it finds an allowed switch, it checks whether the switch was given without an argument following it. If there is no argument after a switch, the empty string is inserted on the command line. After altering sys.argv, the parser is invoked, and you can check for options whose values are "" and act accordingly.
import sys
from optparse import OptionParser
#Instantiate the parser, and add some options; set the options' default values to None, or something predictable that
#can be checked later.
PARSER_DEFAULTVAL = None
parser = OptionParser(usage="%prog -[MODE] INPUT [options]")
#This method doesn't work if interspersed switches and arguments are allowed.
parser.allow_interspersed_args = False
parser.add_option("-d", "--delete", action="store", type="string", dest="to_delete", default=PARSER_DEFAULTVAL)
parser.add_option("-a", "--add", action="store", type="string", dest="to_add", default=PARSER_DEFAULTVAL)
#Build a list of allowed switches, in this case '-d', '--delete', '-a', '--add' (plus the automatic -h/--help) so that
#you can check if something found on sys.argv is indeed a valid switch. This is trivial to make by hand in a short
#example, but if a program has a lot of options, or if you want an idiot-proof way of getting all added options without
#modifying a list yourself, this way is durable. If you are using OptionGroups, simply run the loop below with each
#group's option_list field.
allowed_switches = []
for opt in parser.option_list:
    #Add the short (-a) and long (--add) form of each switch to the list.
    allowed_switches.extend(opt._short_opts + opt._long_opts)
#Insert empty-string values into sys.argv whenever a switch without arguments is found.
#A while loop is used because sys.argv grows as values are inserted.
a = 0
while a < len(sys.argv):
    #Check if the sys.argv value is a switch
    if sys.argv[a] in allowed_switches:
        #Check if it doesn't have an accompanying argument (i.e. if it is followed by another switch, or if it is last
        #on the command line)
        if a == len(sys.argv) - 1 or sys.argv[a + 1] in allowed_switches:
            sys.argv.insert(a + 1, "")
    a += 1
options, args = parser.parse_args()
#If the option is present (i.e. wasn't set to the default value)
if options.to_delete != PARSER_DEFAULTVAL:
    if options.to_delete == "":
        #The switch was not used with any arguments.
        ...
    else:
        #The switch had arguments.
        ...
After checking that the cp command understands e.g. --backup=simple but not --backup simple, I answered the problem like this:
import sys
from optparse import OptionParser
def add_optval_option(pog, *args, **kwargs):
    if 'empty' in kwargs:
        empty_val = kwargs.pop('empty')
        # If the bare switch appears on the command line, insert the
        # placeholder value right after it.
        for i in range(1, len(sys.argv)):
            a = sys.argv[i]
            if a in args:
                sys.argv.insert(i+1, empty_val)
                break
    pog.add_option(*args, **kwargs)
def main(args):
    parser = OptionParser()
    add_optval_option(parser,
                      '--foo', '-f',
                      default='MISSING',
                      empty='EMPTY',
                      help='"EMPTY" if given without a value. Note: '
                      '--foo=VALUE will work; --foo VALUE will *not*!')
    o, a = parser.parse_args(args)
    print('Options:')
    print(' --foo/-f:', o.foo)
    # a[0] is the script name, because all of sys.argv is passed in.
    if a[1:]:
        print('Positional arguments:')
        for arg in a[1:]:
            print(' ', arg)
    else:
        print('No positional arguments')
if __name__=='__main__':
    main(sys.argv)
Self-advertisement: This is part of the opo module of my thebops package ... ;-)
