I'm developing a management script that does a fairly large amount of work via a plethora of command-line options. The first few iterations of the script have used optparse to collect user input and then just run down the page, testing the value of each option in the appropriate order, and doing the action if necessary. This has resulted in a jungle of code that's really hard to read and maintain.
I'm looking for something better.
My hope is to have a system where I can write functions in more or less normal Python fashion, and then when the script is run, have options (and help text) generated from my functions, parsed, and executed in the appropriate order. Additionally, I'd REALLY like to be able to build Django-style sub-command interfaces, where myscript.py install works completely separately from myscript.py remove (separate options, help, etc.).
I've found Simon Willison's optfunc and it does a lot of this, but it seems to just miss the mark: I want to write each OPTION as a function, rather than try to compress the whole option set into a huge string of options.
I imagine an architecture involving a set of classes for major functions, and each defined method of the class corresponding to a particular option in the command line. This structure provides the advantage of having each option reside near the functional code it modifies, easing maintenance. The thing I don't know quite how to deal with is the ordering of the commands, since the ordering of class methods is not deterministic.
Before I go reinventing the wheel: Are there any other existing bits of code that behave similarly? Other things that would be easy to modify? Asking the question has clarified my own thinking on what would be nice, but feedback on why this is a terrible idea, or how it should work would be welcome.
Don't waste time on "introspection".
Each "Command" or "Option" is an object with two sets of method functions or attributes.
Provide setup information to optparse.
Actually do the work.
Here's the superclass for all commands:
class Command( object ):
    name= "name"
    def setup_opts( self, parser ):
        """Add any options to the parser that this command needs."""
        pass
    def execute( self, context, options, args ):
        """Execute the command in some application context with some options and args."""
        # NotImplemented is a comparison sentinel, not an exception.
        raise NotImplementedError
You create subclasses for Install and Remove and every other command you need.
Your overall application looks something like this.
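import logging
import optparse
import sys

logger = logging.getLogger(__name__)

commands = [
    Install(),
    Remove(),
]
def main():
    parser= optparse.OptionParser()
    for c in commands:
        c.setup_opts( parser )
    # OptionParser provides parse_args(), not parse().
    options, args = parser.parse_args()
    if not args:
        parser.error( "a command name is required" )
    command= None
    for c in commands:
        if c.name.startswith(args[0].lower()):
            command= c
            break
    if command:
        # context is whatever application state your commands need.
        status= command.execute( context, options, args[1:] )
    else:
        logger.error( "Command %r is unknown", args[0] )
        status= 2
    sys.exit( status )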
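The main() above registers every command's options on one shared parser. If each sub-command should have its own options and help, as the question asks, one possible variation is to build a fresh parser per command. This is only a sketch under that assumption: the `run` function, the `--force` option, and the `myscript.py` program name are made up for illustration, and `context` is left out to keep it short.

```python
import optparse


class Command(object):
    """Base class: one subclass per sub-command."""
    name = "name"

    def setup_opts(self, parser):
        """Add this command's options to its own parser."""

    def execute(self, options, args):
        raise NotImplementedError


class Install(Command):
    name = "install"

    def setup_opts(self, parser):
        parser.add_option("--force", action="store_true", default=False,
                          help="reinstall even if already present")

    def execute(self, options, args):
        print("installing %s (force=%s)" % (args, options.force))
        return 0


class Remove(Command):
    name = "remove"

    def execute(self, options, args):
        print("removing %s" % (args,))
        return 0


COMMANDS = dict((c.name, c) for c in (Install(), Remove()))


def run(argv):
    """Dispatch argv (without the program name) and return an exit status."""
    if not argv or argv[0] not in COMMANDS:
        print("usage: myscript.py <%s> [options]" % "|".join(sorted(COMMANDS)))
        return 2
    command = COMMANDS[argv[0]]
    # A fresh parser per command keeps options and --help separate.
    parser = optparse.OptionParser(prog="myscript.py %s" % command.name)
    command.setup_opts(parser)
    options, args = parser.parse_args(argv[1:])
    return command.execute(options, args)
```

With this layout, `run(["install", "--help"])` would list only the install options, and the order in which actions happen is decided inside each execute() method rather than by option order.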
The WSGI library werkzeug provides Management Script Utilities which may do what you want, or at least give you a hint how to do the introspection yourself.
from werkzeug import script
# actions go here
def action_test():
    "sample with no args"
    pass
def action_foo(name=2, value="test"):
    "do some foo"
    pass
if __name__ == '__main__':
    script.run()
Which will generate the following help message:
$ python /tmp/test.py --help
usage: test.py <action> [<options>]
test.py --help
actions:
foo:
do some foo
--name integer 2
--value string test
test:
sample with no args
An action is a function in the same module starting with "action_" which takes a number of arguments where every argument has a default. The type of the default value specifies the type of the argument.
Arguments can then be passed by position or using --name=value from the shell.
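As a hint for doing the introspection yourself, here is a rough sketch that uses only the standard library (inspect and argparse). It is not how werkzeug implements it; `build_parser` and `run_action` are hypothetical names, and it assumes every parameter has a default whose type can convert a string (so bool defaults would need special handling).

```python
import argparse
import inspect


def action_foo(name=2, value="test"):
    "do some foo"
    return (name, value)


def action_test():
    "sample with no args"
    return "tested"


def build_parser(namespace):
    """Create one sub-command per function whose name starts with 'action_'."""
    parser = argparse.ArgumentParser()
    subparsers = parser.add_subparsers(dest="action")
    for func_name, func in sorted(namespace.items()):
        if not func_name.startswith("action_") or not callable(func):
            continue
        sub = subparsers.add_parser(func_name[len("action_"):], help=func.__doc__)
        for param in inspect.signature(func).parameters.values():
            # The type of the default value decides how the option is converted.
            sub.add_argument("--" + param.name, type=type(param.default),
                             default=param.default)
        sub.set_defaults(func=func)
    return parser


def run_action(argv, namespace):
    """Parse argv and call the selected action with the parsed options."""
    options = vars(build_parser(namespace).parse_args(argv))
    func = options.pop("func")
    options.pop("action")
    return func(**options)
```

In a script you would probably call `run_action(sys.argv[1:], globals())`. The sketch does not handle the case where no action is given at all.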
I'm in the works of transferring an open source project's CLI from argparse to click. Currently the library allows for the following CLI usage patterns:
manim file.py Scene1 Scene2 -p -ql
but also for subcommands (some with subcommands of their own), like so:
manim plugins --list
manim cfg write --level cwd
... and more.
Thus, the manim keyword is both a way to access subcommands, and more importantly, a command itself with a positional arg (the python file). In argparse, determining if the text following the manim keyword is a subcommand, or positional arg, is a matter of reading sys.argv[2] and determining whether this matches one of the available subcommands, else it's a positional argument. The result is a lot of effort to add subcommands and lots of boilerplate code... hence the motion to transition to click; however, this particular usage pattern doesn't seem to come out of the box in click.
In other words, the following code:
@click.group()
@click.argument('file')
def main(file):
    print('main',file)
@main.command()
@click.option('--opt')
def subcommand1(opt):
    print("sub1",opt)
presents two issues:
requires the main Group's positional argument to be satisfied in order to run any subcommands (i.e. main file.py subcommand1 and not main subcommand1)
it won't execute the Group + argument without a subcommand (i.e. main file.py is not possible without specifying subcommand1 to the Group).
After further research, I've solved the second issue using the click Group parameter, invoke_without_command
@click.group(invoke_without_command=True)
@click.argument('file')
def main(file):
    print('main',file)
@main.command()
@click.option('--opt')
def subcommand1(opt):
    print("sub1",opt)
but I still face an issue with the first point: the general ability to opt out of using the positional arg and instead invoke the subcommand. This StackOverflow article answered by Stephen Rauch seems relevant; however, it only covers the help option and not the more general use case of arguments and options.
How can I allow for arbitrary nesting of subcommands, like a Group, but still provide the capability to execute that Group like a @click.command() with positional arguments? If this isn't a recommended pattern, what would you recommend instead?
In doing some more research on the topic, I came across this stackoverflow article which led to the creation of this custom group:
import click
from click.decorators import pass_context
class SkipArg(click.Group):
    def parse_args(self, ctx, args):
        # Guard against an empty argument list before indexing into it.
        if args and args[0] in self.commands:
            if len(args) == 1 or args[1] not in self.commands:
                # This condition needs updating for multiple positional arguments
                args.insert(0, '')
        return super(SkipArg, self).parse_args(ctx, args)
@click.group(cls=SkipArg, invoke_without_command=True)
@click.argument('file')
@pass_context
def main(ctx, file):
    print('main',ctx,file)
@main.command()
@click.option('--opt')
def subcommand1(opt):
    print("sub1",opt)
This code meets the criteria of using click to have a command with positional arguments nested with more subcommands. The logic of the SkipArg will have to be updated for multiple positional arguments in the case of where I'll be using click, but is enough as a demonstration.
How it meets the criteria:
The main Group functions as a command since it is executable using invoke_without_command.
The positional argument is not required to execute subcommand1 because the class SkipArg inserts '' into the front of the arguments list. This is a bit of hard coded logic that applies empty filler text for the [FILE] positional argument when the zeroth argument matches one of the subcommands.
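To make the filler logic easier to reason about (and to test without click), here it is pulled out into a plain function. This is only an illustration of the condition used in SkipArg.parse_args; `pad_missing_file` is a hypothetical helper, not part of click.

```python
def pad_missing_file(args, commands, filler=""):
    """Return a copy of args with a filler inserted when FILE was omitted."""
    args = list(args)
    if args and args[0] in commands:
        # "cmd" alone, or "cmd <something that is not a subcommand>", means
        # the first token is the subcommand and FILE was left out.
        if len(args) == 1 or args[1] not in commands:
            args.insert(0, filler)
    return args
```

Note the remaining ambiguity: a file whose name equals a subcommand name, followed by a subcommand, is treated as FILE plus subcommand, which is probably what you want.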
I am new to Python and Fabric.
I have a Python module, module1.py, that takes a command-line parameter; it uses argparse to require that parameter.
We are trying to run the whole program through Fabric, but when I specify the parameter on the command line while running through fab, I get:
mycode.py: error: argument --config_yaml is required
How could I pass the argument through fab?
Thanks!
From what i can see from here:
https://github.com/fabric/fabric/blob/master/fabric/main.py#L340
https://github.com/fabric/fabric/blob/master/fabric/main.py#L619
You CAN'T do it. You can't add things to env_options before main.py runs, your code inside fabfile.py will run after main() has already been processed.
You can however do this:
Rename fabfile.py to whatever you want, as long as it's not fabric.py. I called mine fabricc.py:
from fabric.state import env_options, make_option
env_options.append(make_option('--myconfig', dest='config_file', default='default.ini'))
from fabric import main
from fabric.api import task, env
@task
def do_something():
    print('config file: {}'.format(env.config_file))
if __name__ == '__main__':
    main.find_fabfile = lambda x: __file__
    main.main()
now run it:
$ python fabricc.py do_something
config file: default.ini
Done.
or..
$ python fabricc.py do_something --myconfig=somethinelse.ini
config file: somethinelse.ini
Done.
Word of warning DO NOT call it --config -- this is a builtin param.
With this you still enjoy everything you still love about fabric and nothing has changed.
$ python fabricc.py do_something --help
Usage: fab [options] <command>[:arg1,arg2=val2,host=foo,hosts='h1;h2',...] ...
Options:
-h, --help show this help message and exit
-d NAME, --display=NAME
...
-z INT, --pool-size=INT
number of concurrent processes to use in parallel mode
--myconfig=CONFIG_FILE
PS. I don't suggest overriding fabric like this to handle it manually; I'm just showing you it can be done. The reason I don't condone this is compatibility: if the version changes tomorrow, this might break. It's best to use the product how the author meant it to be used.
====================================================================
Also, fabric is meant to be a fancy orm builder, for lack of a better word. You cannot have command line arguments on a per-task basis; it wasn't designed to work like that. BUT it was designed to take in task-function arguments:
@task
def do_something(file=None):
    print('config file: {}'.format(file or 'default.ini'))
$ fab do_something:override.ini
config file: override.ini
Done.
and that is what fabric was created to do.
It's meant to be used like this:
@task
def environment(box_env=None):
    ...
@task
def deploy(branch='master'):
    ...
@task
def provision(recreate_if_existing=False, start_nginx=True):
    ...
fab environment:dev deploy:development
fab environment:dev provision:True,False
all together.
fab environment:dev provision deploy:development
I know argparse, but not fabric. From the error it looks like your script defines a '--config_yaml' argparse argument (and makes it 'required'). But fabric apparently also uses that argument name.
I don't know if fabric also uses argparse. But it is common for programs like that to strip off the commandline arguments that it expects, and pass the rest on to your code.
Do you need to use --config_yaml in your script? And why is it set to required=True? argparse allows you to specify a required parameter, but ideally flagged arguments like this are optional. If not given they should have a reasonable default.
I see a --config=... argument in the fabric API, but not a --config_yaml.
I wonder if this section about per-task arguments is relevant
http://docs.fabfile.org/en/1.10/usage/fab.html#per-task-arguments
It sounds like you need to add the task name to the argument. fabric doesn't just pass all unknown arguments to the task (which is what I was assuming above). But I don't have fabric installed, so can't test it.
Question can be related to Use python subprocess module like a command line simulator
I have written some infrastructure code called my_shell to which you can pass shell commands of my application that looks like this
class ApplicationTestShell(object):
    def __init__(self):
        '''
        Constructor
        '''
        self.play_ground_dir = "/var/tmp/MyAppDir"
        ensure_dir_exists_and_empty(self.play_ground_dir)
    def execute_command(self, command, on_success = None, on_failure = None):
        p = create_shell_process(self, self.play_ground_dir)
        sout, serr = p.communicate(input = command)
        if p.returncode == 0:
            on_success(sout)
        else:
            on_failure(serr)
    def create_shell_process(self, cwd):
        return Popen("/bin/bash", env= {WHAT DO I DO HERE?},cwd = test_dir, stdout=PIPE, stderr=PIPE, stdin=PIPE)
The interesting bit to me here is the env parameter. Python expects a map-like data structure holding all the environment variables. My application requires several variables to be exported and set. The script for setting and exporting them is generated by running, say, '/bin/appload myapp' (assume appload is always available on the path). What I do currently is that when I call p.communicate, I do the following:
p.communicate(input = "eval `/bin/appload myapp`;" + command)
So basically before running the command I call the infrastructure setup.
Is there a better way to do this in Python? I would like to push the eval /bin/appload part into the env parameter of the Popen class, or make it part of the shell creation process.
What are the problems with my current implementation? (I feel it is hacky but I may be wrong)
It depends on how /bin/appload myapp works. If it only guarantees that it will output bash syntax, then parsing that output in Python in order to construct the environment object there is almost certainly more trouble than it's worth (you might need to support parameter and variable expansion, subshells, process substitution, etc, etc). On the other hand, if you are sure that /bin/appload myapp will only ever output lines of the form "VARIABLENAME=someword", then that's pretty trivial to parse in Python and you could move it into your Python code if you like.
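For the simple case, a sketch of that parsing could look like the following. It assumes the output really is nothing but plain NAME=value lines (no quoting, no `export` keyword, no expansion); `parse_env_lines` is a made-up helper name.

```python
import os


def parse_env_lines(text, base=None):
    """Merge NAME=value lines into a copy of base (default: os.environ)."""
    env = dict(os.environ if base is None else base)
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first "=" only, so values may contain "=" themselves.
        name, sep, value = line.partition("=")
        if not sep:
            raise ValueError("not a NAME=value line: %r" % line)
        env[name] = value
    return env
```

The resulting dict could then be handed to Popen as its env argument, after capturing the output of /bin/appload myapp with something like subprocess.check_output. If appload emits real bash syntax, this will not be enough.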
There are an awful lot of different directions you could go with these requirements; you could capture the output of appload myapp into a tempfile and set the subprocess's $BASH_ENV to that filename; that would cause the shell to source your environment setup before running your command in a way that some might consider cleaner. You could give your command (with the eval-ing prefix) as the first argument to Popen and pass shell=True, and let Popen do the bash invocation on its own (setting $SHELL explicitly to bash if necessary). You could use bash's -c option to specify the code to run on the command line rather than via stdin. You could have a multi-tiered approach by invoking a shell from Python which eval's the appload myapp environment and then exec's another shell underneath it, so that the first doesn't show up in ps listings and the command given to create_shell_process has the shell all to itself (although that shouldn't really matter). You could do a lot of things, depending on what your concerns are with respect to how the shell is invoked, how it looks in ps listings, whether you want your command to still be run if the appload myapp output produces an error when eval'd, etc. But for a general solution, I think what you have is perfectly fine.
I don't see any real problems with the implementation, besides cosmetic things or minor things that probably only came from copying and pasting the code: create_shell_process doesn't use its cwd parameter, and the on_success and on_failure parameters look like they're optional but the defaults will break things (you can't call None).
I am working on a Python command-line interface program, and I find testing it tedious. For example, here is the help information of the program:
usage: pyconv [-h] [-f ENCODING] [-t ENCODING] [-o file_path] file_path
Convert text file from one encoding to another.
positional arguments:
file_path
optional arguments:
-h, --help show this help message and exit
-f ENCODING, --from ENCODING
Encoding of source file
-t ENCODING, --to ENCODING
Encoding you want
-o file_path, --output file_path
Output file path
Whenever I change the program and want to test something, I must open a terminal,
type the command (with options and arguments), press Enter, and see whether any error occurs
while running. If an error does occur, I must go back to the editor and check the code
from top to bottom, guessing where the bug is, make small changes, add print lines,
return to the terminal, run the command again...
Recursively.
So my question is, what is the best way to do testing with CLI program, can it be as easy
as unit testing with normal python scripts?
I think it's perfectly fine to test functionally on a whole-program level. It's still possible to test one aspect/option per test. This way you can be sure that the program really works as a whole. Writing unit-tests usually means that you get to execute your tests quicker and that failures are usually easier to interpret/understand. But unit-tests are typically more tied to the program structure, requiring more refactoring effort when you internally change things.
Anyway, using py.test, here is a little example for testing a latin1 to utf8 conversion for pyconv::
# content of test_pyconv.py
import pytest
# we reuse a bit of pytest's own testing machinery, this should eventually come
# from a separately installable pytest-cli plugin.
pytest_plugins = ["pytester"]
@pytest.fixture
def run(testdir):
    def do_run(*args):
        args = ["pyconv"] + list(args)
        return testdir._run(*args)
    return do_run
def test_pyconv_latin1_to_utf8(tmpdir, run):
    input = tmpdir.join("example.txt")
    content = unicode("\xc3\xa4\xc3\xb6", "latin1")
    with input.open("wb") as f:
        f.write(content.encode("latin1"))
    output = tmpdir.join("example.txt.utf8")
    result = run("-flatin1", "-tutf8", input, "-o", output)
    assert result.ret == 0
    with output.open("rb") as f:
        newcontent = f.read()
    assert content.encode("utf8") == newcontent
After installing pytest ("pip install pytest") you can run it like this::
$ py.test test_pyconv.py
=========================== test session starts ============================
platform linux2 -- Python 2.7.3 -- pytest-2.4.5dev1
collected 1 items
test_pyconv.py .
========================= 1 passed in 0.40 seconds =========================
The example reuses some internal machinery of pytest's own testing by leveraging pytest's fixture mechanism, see http://pytest.org/latest/fixture.html. If you forget about the details for a moment, you can just work from the fact that "run" and "tmpdir" are provided for helping you to prepare and run tests. If you want to play, you can try to insert a failing assert-statement or simply "assert 0" and then look at the traceback or issue "py.test --pdb" to enter a python prompt.
Start from the user interface with functional tests and work down towards unit tests. It can feel difficult, especially when you use the argparse module or the click package, which take control of the application entry point.
The cli-test-helpers Python package has examples and helper functions (context managers) for a holistic approach on writing tests for your CLI. It's a simple idea, and one that works perfectly with TDD:
Start with functional tests (to ensure your user interface definition) and
Work towards unit tests (to ensure your implementation contracts)
Functional tests
NOTE: I assume you develop code that is deployed with a setup.py file or is run as a module (-m).
Is the entrypoint script installed? (tests the configuration in your setup.py)
Can this package be run as a Python module? (i.e. without having to be installed)
Is command XYZ available? etc. Cover your entire CLI usage here!
Those tests are simplistic: They run the shell command you would enter in the terminal, e.g.
import os
def test_entrypoint():
    exit_status = os.system('foobar --help')
    assert exit_status == 0
Note the trick to use a non-destructive operation (e.g. --help or --version) as we can't mock anything with this approach.
Towards unit tests
To test single aspects inside the application you will need to mimic things like command line arguments and maybe environment variables. You will also need to catch the exit of your script so that the tests don't fail on SystemExit exceptions.
Example with ArgvContext to mimic command line arguments:
from unittest.mock import patch
import pytest
from cli_test_helpers import ArgvContext
import foobar.cli
@patch('foobar.command.baz')
def test_cli_command(mock_command):
    """Is the correct code called when invoked via the CLI?"""
    with ArgvContext('foobar', 'baz'), pytest.raises(SystemExit):
        foobar.cli.main()
    assert mock_command.called
Note that we mock the function that we want our CLI framework (click in this example) to call, and that we catch SystemExit that the framework naturally raises. The context managers are provided by cli-test-helpers and pytest.
Unit tests
The rest is business as usual: with the above two strategies we've overcome the control a CLI framework may have taken away from us, and what remains is ordinary unit testing. TDD-style, hopefully.
Disclosure: I am the author of the cli-test-helpers Python package.
So my question is, what is the best way to do testing with CLI program, can it be as easy as unit testing with normal python scripts?
The only difference is that when you run Python module as a script, its __name__ attribute is set to '__main__'. So generally, if you intend to run your script from command line it should have following form:
import sys
# function and class definitions, etc.
# ...
def foo(arg):
    pass
def main():
    """Entry point to the script"""
    # Do parsing of command line arguments and other stuff here. And then
    # make calls to whatever functions and classes that are defined in your
    # module. For example:
    foo(sys.argv[1])
if __name__ == '__main__':
    main()
Now there is no difference, how you would use it: as a script or as a module. So inside your unit-testing code you can just import foo function, call it and make any assertions you want.
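A small variation that makes main() itself testable is to let it accept the argument list as a parameter. The sketch below assumes argparse; the `word` argument and the body of foo() are placeholders made up for the example.

```python
import argparse


def foo(arg):
    """The real work lives in ordinary functions that tests can import."""
    return arg.upper()


def main(argv=None):
    """Parse argv (argparse falls back to sys.argv[1:] when it is None)."""
    parser = argparse.ArgumentParser()
    parser.add_argument("word")
    options = parser.parse_args(argv)
    return foo(options.word)


def test_foo():
    assert foo("abc") == "ABC"


def test_main_parses_arguments():
    assert main(["abc"]) == "ABC"
```

The script would still end with the usual `if __name__ == '__main__':` guard calling main(); it is left out here so that the snippet can be imported or pasted into a test file without running anything.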
Maybe too little too late,
but you can always use
import os
result = os.system('<insert your command with options here>')
assert result == 0
In that way, you can run your program as if it was from command line, and evaluate the exit code.
(Update after I studied pytest)
You can also use capsys.
(from running pytest --fixtures)
capsys
Enable text capturing of writes to sys.stdout and sys.stderr.
The captured output is made available via ``capsys.readouterr()`` method
calls, which return a ``(out, err)`` namedtuple.
``out`` and ``err`` will be ``text`` objects.
This isn't for Python specifically, but what I do to test command-line scripts is to run them with various predetermined inputs and options and store the correct output in a file. Then, to test them when I make changes, I simply run the new script and pipe the output into diff correct_output -. If the files are the same, it outputs nothing. If they're different, it shows you where. This will only work if you are on Linux or OS X; on Windows, you will have to get MSYS.
Example:
python mycliprogram --someoption "some input" | diff correct_output -
To make it even easier, you can add all these test runs to your 'make test' Makefile target, which I assume you already have. ;)
If you are running many of these at once, you could make it a little more obvious where each one ends by adding a fail tag:
python mycliprogram --someoption "some input" | diff correct_output - || { tput setaf 1; echo "FAILED"; }
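If you would rather keep the comparison inside a Python test (for example because diff is not available on Windows), the standard difflib module can play the role of diff. This is just a sketch; `diff_output` is a made-up helper, and the file labels are arbitrary.

```python
import difflib


def diff_output(expected, actual):
    """Return unified-diff lines; an empty list means the output matches."""
    return list(difflib.unified_diff(
        expected.splitlines(), actual.splitlines(),
        fromfile="correct_output", tofile="actual_output", lineterm=""))
```

The actual text would come from running the program, for instance via the stdout captured by subprocess.run, and the expected text from the stored correct_output file.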
The short answer is yes, you can use unit tests, and should. If your code is well structured, it should be quite easy to test each component separately, and if you need to, you can always mock sys.argv to simulate running it with different arguments.
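Mocking sys.argv can be done with the standard unittest.mock module. A minimal sketch, where main() is a toy stand-in for your real entry point:

```python
import sys
from unittest import mock


def main():
    """Toy entry point that reads sys.argv directly."""
    return sys.argv[1:]


def test_main_sees_patched_argv():
    fake_argv = ["mycliprogram", "--someoption", "some input"]
    # patch.object restores the real sys.argv when the block exits.
    with mock.patch.object(sys, "argv", fake_argv):
        assert main() == ["--someoption", "some input"]
```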
pytest-console-scripts is a Pytest plugin for testing python scripts installed via console_scripts entry point of setup.py.
For Python 3.5+, you can use the simpler subprocess.run to call your CLI command from your test.
Using pytest:
import subprocess
def test_command__works_properly():
    try:
        result = subprocess.run(['command', '--argument', 'value'], check=True, capture_output=True, text=True)
    except subprocess.CalledProcessError as error:
        print(error.stdout)
        print(error.stderr)
        raise error
The output can be accessed via result.stdout, result.stderr, and result.returncode if needed.
The check parameter causes an exception to be raised if an error occurs. Note Python 3.7+ is required for the capture_output and text parameters, which simplify capturing and reading stdout/stderr.
Given that you are explicitly asking about testing for a command line application, I believe that you are aware of unit-testing tools in python and that you are actually looking for a tool to automate end-to-end tests of a command line tool. There are a couple of tools out there that are specifically designed for that. If you are looking for something that's pip-installable, I would recommend cram. It integrates well with the rest of the python environment (e.g. through a pytest extension) and it's quite easy to use:
Simply write the commands you want to run, each prefixed with $, and put the expected output on the lines that follow. For example, the following would be a valid cram test:
$ echo Hello
Hello
By having four spaces in front of expected output and two in front of the test, you can actually use these tests to also write documentation. More on that on the website.
You can use standard unittest module:
# python -m unittest <test module>
or use nose as a testing framework. Just write classic unittest files in separate directory and run:
# nosetests <test modules directory>
Writing unittests is easy. Just follow online manual for unittesting
I would not test the program as a whole; this is not a good test strategy and may not catch the actual spot of the error. The CLI is just a front end to an API. You test the API via your unit tests, and then when you make a change to a specific part you have a test case to exercise that change.
So, restructure your application so that you test the API and not the application itself. But you can have a functional test that actually does run the full application and checks that the output is correct.
In short, yes testing the code is the same as testing any other code, but you must test the individual parts rather than their combination as a whole to ensure that your changes do not break everything.
When I'm writing shell scripts, I often find myself spending most of my time (especially when debugging) dealing with argument processing. Many scripts I write or maintain are easily more than 80% input parsing and sanitization. I compare that to my Python scripts, where argparse handles most of the grunt work for me, and lets me easily construct complex option structures and sanitization / string parsing behavior.
I'd love, therefore, to be able to have Python do this heavy lifting, and then get these simplified and sanitized values in my shell script, without needing to worry any further about the arguments the user specified.
To give a specific example, many of the shell scripts where I work have been defined to accept their arguments in a specific order. You can call start_server.sh --server myserver --port 80 but start_server.sh --port 80 --server myserver fails with You must specify a server to start. - it makes the parsing code a lot simpler, but it's hardly intuitive.
So a first pass solution could be something as simple as having Python take in the arguments, sort them (keeping their parameters next to them) and returning the sorted arguments. So the shell script still does some parsing and sanitization, but the user can input much more arbitrary content than the shell script natively accepts, something like:
# script.sh -o -aR --dir /tmp/test --verbose
#!/bin/bash
args=$(order.py "$@")
# args is set to "-a --dir /tmp/test -o -R --verbose"
# simpler processing now that we can guarantee the order of parameters
There are some obvious limitations here, notably that order.py can't distinguish between a final option with an argument and the start of indexed arguments, but that doesn't seem that terrible.
So here's my question: 1) Is there any existing (Python preferably) utility to enable CLI parsing by something more powerful than bash, which can then be accessed by the rest of my bash script after sanitization, or 2) Has anyone done this before? Are there issues or pitfalls or better solutions I'm not aware of? Care to share your implementation?
One (very half-baked) idea:
#!/bin/bash
# Some sort of simple syntax to describe to Python what arguments to accept
opts='
"a", "append", boolean, help="Append to existing file"
"dir", str, help="Directory to run from"
"o", "overwrite", boolean, help="Overwrite duplicates"
"R", "recurse", boolean, help="Recurse into subdirectories"
"v", "verbose", boolean, help="Print additional information"
'
# Takes in CLI arguments and outputs a sanitized structure (JSON?) or fails
p=$(parse.py "Runs complex_function with nice argument parsing" "$opts" "$@")
if [ $? -ne 0 ]; then exit 1; fi # while parse outputs usage to stderr
# Takes the sanitized structure and an argument to get
append=$(arg.py "$p" append)
overwrite=$(arg.py "$p" overwrite)
recurse=$(arg.py "$p" recurse)
verbose=$(arg.py "$p" verbose)
cd "$(arg.py "$p" dir)"
complex_function $append $overwrite $recurse $verbose
Two lines of code, along with concise descriptions of the arguments to expect, and we're on to the actual script behavior. Maybe I'm crazy, but that seems way nicer than what I feel like I have to do now.
I've seen Parsing shell script arguments and things like this wiki page on easy CLI argument parsing, but many of these patterns feel clunky and error prone, and I dislike having to re-implement them every time I write a shell script, especially when Python, Java, etc. have such nice argument processing libraries.
You could potentially take advantage of associative arrays in bash to help obtain your goal.
declare -A opts=($(getopts.py "$@"))
cd ${opts[dir]}
complex_function ${opts[append]} ${opts[overwrite]} ${opts[recurse]} \
    ${opts[verbose]} ${opts[args]}
To make this work, getopts.py should be a python script that parses and sanitizes your arguments. It should print a string like the following:
[dir]=/tmp
[append]=foo
[overwrite]=bar
[recurse]=baz
[verbose]=fizzbuzz
[args]="a b c d"
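A possible shape for such a getopts.py is sketched below. The option set is a shortened, made-up subset of the question's example, `format_assoc` is a hypothetical helper, and booleans are printed as true/false so they can be used directly in bash tests. Values are quoted with shlex.quote, which only takes effect when the output is passed through eval, as in the eval variants of this approach.

```python
import argparse
import shlex


def format_assoc(values):
    """Render a dict as bash associative-array entries, one per line."""
    lines = []
    for key, value in sorted(values.items()):
        if isinstance(value, bool):
            value = "true" if value else "false"
        lines.append("[%s]=%s" % (key, shlex.quote(str(value))))
    return "\n".join(lines)


def main(argv):
    """Parse argv and return the text the bash script should receive."""
    parser = argparse.ArgumentParser(prog="getopts.py")
    parser.add_argument("--dir", default=".")
    parser.add_argument("-a", "--append", action="store_true")
    parser.add_argument("-v", "--verbose", action="store_true")
    parser.add_argument("args", nargs="*")
    options = vars(parser.parse_args(argv))
    # Collapse the leftover positional arguments into one string.
    options["args"] = " ".join(options["args"])
    return format_assoc(options)
```

A real script would finish with `print(main(sys.argv[1:]))`. On a parse error argparse exits with status 2 and writes the usage message to stderr, which fits the exit-code check in the bash script.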
You could set aside values for checking that the options were able to be properly parsed and sanitized as well.
Returned from getopts.py:
[__error__]=true
Added to bash script:
if ${opts[__error__]}; then
    exit 1
fi
If you would rather work with the exit code from getopts.py, you could play with eval:
getopts=$(getopts.py "$@") || exit 1
eval declare -A opts=($getopts)
Alternatively:
getopts=$(getopts.py "$@")
if [[ $? -ne 0 ]]; then
    exit 1;
fi
eval declare -A opts=($getopts)
Having the very same needs, I ended up writing an optparse-inspired parser for bash (which actually uses python internally); you can find it here:
https://github.com/carlobaldassi/bash_optparse
See the README at the bottom for a quick explanation. You may want to check out a simple example at:
https://github.com/carlobaldassi/bash_optparse/blob/master/doc/example_script_simple
From my experience, it's quite robust (I'm super-paranoid), feature-rich, etc., and I'm using it heavily in my scripts. I hope it may be useful to others. Feedback/contributions welcome.
Edit: I haven't used it (yet), but if I were posting this answer today I would probably recommend https://github.com/docopt/docopts instead of a custom approach like the one described below.
I've put together a short Python script that does most of what I want. I'm not convinced it's production quality yet (notably error handling is lacking), but it's better than nothing. I'd welcome any feedback.
It takes advantage of the set builtin to re-assign the positional arguments, allowing the remainder of the script to still handle them as desired.
bashparse.py
#!/usr/bin/env python
import optparse, sys
try:
    from shlex import quote  # Python 3.3+
except ImportError:
    from pipes import quote  # Python 2
'''
Uses Python's optparse library to simplify command argument parsing.
Takes in a set of optparse arguments, separated by newlines, followed by command line arguments, as argv[2] and argv[3:]
and outputs a series of bash commands to populate associated variables.
'''
class _ThrowParser(optparse.OptionParser):
    def error(self, msg):
        """Overrides optparse's default error handling
        and instead raises an exception which will be caught upstream
        """
        raise optparse.OptParseError(msg)
def gen_parser(usage, opts_ls):
    '''Takes a list of strings which can be used as the parameters to optparse's add_option function.
    Returns a parser object able to parse those options
    '''
    parser = _ThrowParser(usage=usage)
    for opts in opts_ls:
        if opts:
            # yes, I know it's evil, but it's easy
            eval('parser.add_option(%s)' % opts)
    return parser
def print_bash(opts, args):
    '''Takes the result of optparse and outputs commands to update a shell'''
    for opt, val in opts.items():
        if val:
            # str() so that store_true flags (True) can be quoted too
            print('%s=%s' % (opt, quote(str(val))))
    print("set -- %s" % " ".join(quote(a) for a in args))
if __name__ == "__main__":
    # argv[1] is the usage string and argv[2] the option definitions
    if len(sys.argv) < 3:
        sys.stderr.write("Needs at least a usage string and a set of options to parse")
        sys.exit(2)
    parser = gen_parser(sys.argv[1], sys.argv[2].split('\n'))
    (opts, args) = parser.parse_args(sys.argv[3:])
    print_bash(opts.__dict__, args)
Example usage:
#!/bin/bash
usage="[-f FILENAME] [-t|--truncate] [ARGS...]"
opts='
"-f"
"-t", "--truncate",action="store_true"
'
echo "$(./bashparse.py "$usage" "$opts" "$@")"
eval "$(./bashparse.py "$usage" "$opts" "$@")"
echo
echo OUTPUT
echo $f
echo $@
echo $0 $2
Which, if run as: ./run.sh one -f 'a_filename.txt' "two' still two" three outputs the following (notice that the internal positional variables are still correct):
f=a_filename.txt
set -- one 'two'"'"' still two' three
OUTPUT
a_filename.txt
one two' still two three
./run.sh two' still two
Disregarding the debugging output, you're looking at approximately four lines to construct a powerful argument parser. Thoughts?
The original premise of my question assumes that delegating to Python is the right approach to simplify argument parsing. If we drop the language requirement we can actually do a decent job* in Bash, using getopts and a little eval magic:
main() {
    local _usage='foo [-a] [-b] [-f val] [-v val] [args ...]'
    eval "$(parse_opts 'f:v:ab')"
    echo "f=$f v=$v a=$a b=$b -- $#: $*"
}
main "$@"
The implementation of parse_opts is in this gist, but the basic approach is to convert options into local variables which can then be handled like normal. All the standard getopts boilerplate is hidden away, and error handling works as expected.
Because it uses local variables within a function, parse_opts is not just useful for command line arguments, it can be used with any function in your script.
* I say "decent job" because Bash's getopts is a fairly limited parser and only supports single-letter options. Elegant, expressive CLIs are still better implemented in other languages like Python. But for reasonably small functions or scripts this provides a nice middle ground without adding too much complexity or bloat.