I was wondering how to create a flexible CLI interface with Python. So far I have come up with the following:
$ cat cat.py
#!/usr/bin/env python
from sys import stdin
from fileinput import input
from argparse import ArgumentParser, FileType
def main(args):
    for line in input():
        print line.strip()

if __name__ == "__main__":
    parser = ArgumentParser()
    parser.add_argument('FILE', nargs='?', type=FileType('r'), default=stdin)
    main(parser.parse_args())
This handles both stdin and file input:
$ echo 'stdin test' | ./cat.py
stdin test
$ ./cat.py file
file test
The problem is that it doesn't handle multiple inputs or no input the way I would like:
$ ./cat.py file file
usage: cat.py [-h] [FILE]
cat.py: error: unrecognized arguments: file
$ ./cat.py
For multiple inputs it should cat the file multiple times, and for no input it should ideally have the same behaviour as -h:
$ ./cat.py -h
usage: cat.py [-h] [FILE]
positional arguments:
FILE
optional arguments:
-h, --help show this help message and exit
Any ideas on creating a flexible CLI interface with Python?
Use nargs='*' to allow for 0 or more arguments:
if __name__ == "__main__":
    parser = ArgumentParser()
    parser.add_argument('FILE', nargs='*', type=FileType('r'), default=stdin)
    main(parser.parse_args())
The help output now is:
$ bin/python cat.py -h
usage: cat.py [-h] [FILE [FILE ...]]
positional arguments:
FILE
optional arguments:
-h, --help show this help message and exit
and when no arguments are given, stdin is used.
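Multiple files are accepted now as well; with the same test file as above (containing 'file test'), the invocation that previously failed should cat it twice:
$ ./cat.py file file
file test
file test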
If you want to require at least one FILE argument, use nargs='+' instead, but then the default is ignored, so you may as well drop that:
if __name__ == "__main__":
    parser = ArgumentParser()
    parser.add_argument('FILE', nargs='+', type=FileType('r'))
    main(parser.parse_args())
Now not specifying a command-line argument gives:
$ bin/python cat.py
usage: cat.py [-h] FILE [FILE ...]
cat.py: error: too few arguments
You can still explicitly specify stdin by passing in - as an argument:
$ echo 'hello world!' | bin/python cat.py -
hello world!
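Note that main() in the question still goes through fileinput.input(), which re-reads sys.argv itself. Since FileType already hands you open file objects, main() could also consume the parsed arguments directly; a minimal sketch (one possible variant, not the only way):
def main(args):
    # With nargs='*' and default=stdin, args.FILE may be a single file
    # object (stdin) or a list of open files; normalise to a list first.
    files = args.FILE if isinstance(args.FILE, list) else [args.FILE]
    for f in files:
        for line in f:
            print line.strip()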
A pretty good CLI interface that handles file input, standard input, no input, file output and in-place editing:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
def main(args, help):
    '''
    Simple line numbering program to demonstrate CLI interface
    '''
    # Show help if there is nothing to read: no piped stdin and no file arguments
    if not (select.select([sys.stdin], [], [], 0.0)[0] or args.files):
        help()
        return
    if args.output and args.output != '-':
        sys.stdout = open(args.output, 'w')
    try:
        for i, line in enumerate(fileinput.input(args.files, inplace=args.inplace)):
            print i + 1, line.strip()
    except IOError:
        sys.stderr.write("%s: No such file %s\n" %
                         (os.path.basename(__file__), fileinput.filename()))

if __name__ == "__main__":
    import os, sys, select, argparse, fileinput
    parser = argparse.ArgumentParser()
    parser.add_argument('files', nargs='*', help='input files')
    group = parser.add_mutually_exclusive_group()
    group.add_argument('-i', '--inplace', action='store_true',
                       help='modify files inplace')
    group.add_argument('-o', '--output',
                       help='output file. The default is stdout')
    main(parser.parse_args(), parser.print_help)
The code simply emulates nl and numbers the lines but should serve as a good skeleton for many applications.
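Assuming the script above is saved as nl.py (the script name and file names below are made up purely for illustration), typical invocations would look roughly like this:
$ printf 'alpha\nbeta\n' | ./nl.py
1 alpha
2 beta
$ ./nl.py notes.txt              # number a file to stdout
$ ./nl.py notes.txt -o out.txt   # write the numbered lines to out.txt
$ ./nl.py -i notes.txt           # modify notes.txt in place
$ ./nl.py                        # nothing piped, no files: prints the help text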
Related
I'm developing a command line tool with Python whose functionality is broken down into a number of sub-commands, and basically each one takes input and output files as arguments. The tricky part is that each command requires a different number of parameters (some require no output file, some require several input files, etc).
Ideally, the interface would be called as:
./test.py ncinfo inputfile
Then, the parser would realise that the ncinfo command requires a single argument (complaining if the given arguments do not fit the command), and then it calls the function:
ncinfo(inputfile)
that does the actual job.
When the command requires more options, for instance
./test.py timmean inputfile outputfile
the parser would realise it, check that the two arguments are indeed given, and then call:
timmean(inputfile, outputfile)
This scheme is ideally generalised for an arbitrary list of 1-argument commands, 2-argument commands, and so on.
However I'm struggling to get this behaviour with Python argparse. This is what I have so far:
#! /home/navarro/SOFTWARE/anadonda3/bin/python
import argparse
# create the top-level parser
parser = argparse.ArgumentParser()
subparsers = parser.add_subparsers()
# create the parser for the "ncinfo" command
parser_1 = subparsers.add_parser('ncinfo', help='prints out basic netCDF structure')
parser_1.add_argument('filein', help='the input file')
# create the parser for the "timmean" command
parser_2 = subparsers.add_parser('timmean', help='calculates temporal mean and stores it in output file')
parser_2.add_argument('filein', help='the input file')
parser_2.add_argument('fileout', help='the output file')
# parse the argument lists
parser.parse_args()
print(parser.filein)
print(parser.fileout)
But this doesn't work as expected. First, when I call the script without arguments, I get no error message telling me which options I have. Second, when I try to run the program with the ncinfo command, I get an error:
./test.py ncinfo testfile
Traceback (most recent call last):
File "./test.py", line 21, in <module>
print(parser.filein)
AttributeError: 'ArgumentParser' object has no attribute 'filein'
What am I doing wrong that precludes me achieving the desired behaviour? Is the use of subparsers sensible in this context?
Bonus point: is there a way to generalise the definition of the commands, so that I do not need to add manually every single command? For instance, grouping all 1-argument commands into a list, and then define the parser within a loop. This sounds reasonable, but I don't know if it is possible. Otherwise, as the number of tools grows, the parser itself is going to become hard to maintain.
import argparse
import sys

SUB_COMMANDS = [
    "ncinfo",
    "timmean"
]

def ncinfo(args):
    print("executing: ncinfo")
    print(" inputfile: %s" % args.inputfile)

def timmean(args):
    print("executing: timmean")
    print(" inputfile: %s" % args.inputfile)
    print(" outputfile: %s" % args.outputfile)

def add_parser(subcmd, subparsers):
    if subcmd == "ncinfo":
        parser = subparsers.add_parser("ncinfo")
        parser.add_argument("inputfile", metavar="INPUT")
        parser.set_defaults(func=ncinfo)
    elif subcmd == "timmean":
        parser = subparsers.add_parser("timmean")
        parser.add_argument("inputfile", metavar="INPUT")
        parser.add_argument("outputfile", metavar="OUTPUT")
        parser.set_defaults(func=timmean)

if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument('-o', '--common-option', action='store_true')
    subparsers = parser.add_subparsers(help="sub-commands")
    for cmd in SUB_COMMANDS:
        add_parser(cmd, subparsers)
    args = parser.parse_args(sys.argv[1:])
    if args.common_option:
        print("common option is active")
    try:
        args.func(args)
    except AttributeError:
        parser.error("too few arguments")
Some usage examples:
$ python test.py --help
usage: test.py [-h] [-o] {ncinfo,timmean} ...
positional arguments:
{ncinfo,timmean} sub-commands
optional arguments:
-h, --help show this help message and exit
-o, --common-option
$ python test.py ncinfo --help
usage: test.py ncinfo [-h] INPUT
positional arguments:
INPUT
optional arguments:
-h, --help show this help message and exit
$ python test.py timmean --help
usage: test.py timmean [-h] INPUT OUTPUT
positional arguments:
INPUT
OUTPUT
optional arguments:
-h, --help show this help message and exit
$ python test.py -o ncinfo foo
common option is active
executing: ncinfo
inputfile: foo
$ python test.py -o timmean foo bar
common option is active
executing: timmean
inputfile: foo
outputfile: bar
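For the bonus point: one way to avoid writing each sub-parser by hand is to drive the definitions from a table and build them in a loop. The sketch below is only an illustration (the COMMANDS dict and its contents are made up) and assumes every sub-command takes nothing but positional file arguments:
import argparse

# Hypothetical table: sub-command name -> list of positional argument names.
COMMANDS = {
    "ncinfo":  ["inputfile"],
    "timmean": ["inputfile", "outputfile"],
}

def build_parser():
    parser = argparse.ArgumentParser()
    subparsers = parser.add_subparsers(help="sub-commands")
    for name, positionals in sorted(COMMANDS.items()):
        sub = subparsers.add_parser(name)
        for arg in positionals:
            sub.add_argument(arg)
        # remember which sub-command was chosen
        sub.set_defaults(command=name)
    return parser

if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
From there you can dispatch on args.command, or store the actual function with set_defaults(func=...) as in the answer above.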
I'm new to Python and currently experimenting with argparse to add command line options. However my code is not working; despite looking at various online tutorials and reading up on argparse, I still don't fully understand it. My problem is that whenever I try to use my -name option it gives me a find.py error: argument regex:
Here is my call:
./find.py ../Python -name '[0-9]*\.txt'
../Python is one directory up from my current one and contains a list of files/directories. Without the -name option I print out the files with their path (this works fine), but with the -name option I want to print out only the files matching the regex, and that won't work. Here is what I currently have:
#!/usr/bin/python2.7
import os, sys, argparse, re
from stat import *

def regex_type(s, pattern=re.compile(r"[a-f0-9A-F]")):
    if not pattern.match(s):
        raise argparse.ArgumentTypeError
    return s

def main():
    direc = sys.argv[1]
    for f in os.listdir(direc):
        pathname = os.path.join(direc, f)
        mode = os.stat(pathname).st_mode
        if S_ISREG(mode):
            print pathname
    parser = argparse.ArgumentParser()
    parser.add_argument(
        '-name', default=[sys.stdin], nargs="*")
    parser.add_argument('regex', type=regex_type)
    args = parser.parse_args()

if __name__ == '__main__':
    main()
I tweaked your type function to be more informative:
def regex_type(s, pattern=re.compile(r"[a-f0-9A-F]")):
    print('regex string', s)
    if not pattern.match(s):
        raise argparse.ArgumentTypeError('pattern not match')
    return s
Called with
2104:~/mypy$ python2 stack50072557.py .
I get:
<directory listing>
('regex string', '.')
usage: stack50072557.py [-h] [-name [NAME [NAME ...]]] regex
stack50072557.py: error: argument regex: pattern not match
So it tries to pass sys.argv[1], the first string after the script name, to the regex_type function. If it fails it issues the error message and usage.
OK, the problem was the dot in the directory argument (it doesn't match the pattern); I'll make a directory:
2108:~/mypy$ mkdir foo
2136:~/mypy$ python2 stack50072557.py foo
('regex string', 'foo')
Namespace(name=[<open file '<stdin>', mode 'r' at 0x7f3bea2370c0>], regex='foo')
2138:~/mypy$ python2 stack50072557.py foo -name a b c
('regex string', 'foo')
Namespace(name=['a', 'b', 'c'], regex='foo')
The strings following '-name' are allocated to that attribute. There's nothing in your code that will test them or pass them through the regex_type function. Only the first non-flag string does that.
Reading sys.argv[1] initially does not remove it from the list. It's still there for use by the parser.
I would set up a parser that uses a store_true --name argument, and 2 positionals - one for the dir and the other for regex.
After parsing check args.name. If false print the contents of args.dir. If true, perform your args.regex filter on those contents. glob might be useful.
The parser finds out what your user wants. Your own code acts on it. Especially as a beginner, it is easier and cleaner to separate the two steps.
With:
def parse(argv=None):
    parser = argparse.ArgumentParser()
    parser.add_argument('-n', '--name', action='store_true')
    parser.add_argument('--dir', default='.')
    parser.add_argument('--regex', default=r"[a-f0-9A-F]")
    args = parser.parse_args(argv)
    print(args)
    return args

def main(argv=None):
    args = parse(argv)
    dirls = os.listdir(args.dir)
    if args.name:
        dirls = [f for f in dirls if re.match(args.regex, f)]
        print(dirls)
    else:
        print(dirls)
I get runs like:
1005:~/mypy$ python stack50072557.py
Namespace(dir='.', name=False, regex='[a-f0-9A-F]')
['test.npz', 'stack49909128.txt', 'stack49969840.txt', 'stack49824248.py', 'test.h5', 'stack50072557.py', 'stack49963862.npy', 'Mcoo.npz', 'test_attribute.h5', 'stack49969861.py', 'stack49969605.py', 'stack49454474.py', 'Mcsr.npz', 'Mdense.npy', 'stack49859957.txt', 'stack49408644.py', 'Mdok', 'test.mat5', 'stack50012754.py', 'foo', 'test']
1007:~/mypy$ python stack50072557.py -n
Namespace(dir='.', name=True, regex='[a-f0-9A-F]')
['foo']
1007:~/mypy$ python stack50072557.py -n --regex='.*\.txt'
Namespace(dir='.', name=True, regex='.*\\.txt')
['stack49909128.txt', 'stack49969840.txt', 'stack49859957.txt']
and help:
1007:~/mypy$ python stack50072557.py -h
usage: stack50072557.py [-h] [-n] [--dir DIR] [--regex REGEX]
optional arguments:
-h, --help show this help message and exit
-n, --name
--dir DIR
--regex REGEX
If I change the dir line to:
parser.add_argument('dir', default='.')
help is now
1553:~/mypy$ python stack50072557.py -h
usage: stack50072557.py [-h] [-n] [--regex REGEX] dir
positional arguments:
dir
optional arguments:
-h, --help show this help message and exit
-n, --name
--regex REGEX
and runs are:
1704:~/mypy$ python stack50072557.py -n
usage: stack50072557.py [-h] [-n] [--regex REGEX] dir
stack50072557.py: error: too few arguments
1705:~/mypy$ python stack50072557.py . -n
Namespace(dir='.', name=True, regex='[a-f0-9A-F]')
['foo']
1705:~/mypy$ python stack50072557.py ../mypy -n --regex='.*\.txt'
Namespace(dir='../mypy', name=True, regex='.*\\.txt')
['stack49909128.txt', 'stack49969840.txt', 'stack49859957.txt']
I get the error because it now requires a directory, even if it is just '.'.
Note that the script still uses:
if __name__ == '__main__':
    main()
My main loads the dir, and applies the regex filter to that list of names. My args.dir replaces your direc.
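As for the glob remark above: if the -name patterns are meant to be shell-style patterns (as with find -name) rather than regexes, the fnmatch module may be a better fit. A small sketch, not part of the tested code above:
import fnmatch
import os

def list_matching(direc, pattern):
    # Shell-style matching (e.g. '[0-9]*.txt'), as used by find -name
    return [f for f in os.listdir(direc) if fnmatch.fnmatch(f, pattern)]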
I'm trying to use Python's argparse but I cannot get a command line parameter.
Here my code:
DEFAULT_START_CONFIG='/tmp/config.json'
parser = argparse.ArgumentParser(description='Start the Cos service and broker for development purposes.')
parser.add_argument('-c', '--config', default=DEFAULT_START_CONFIG, action=FileAction, type=str, nargs='?',
                    help='start configuration json file (default:' + DEFAULT_START_CONFIG + ')')
args = parser.parse_args()
But then when I run my python script like:
./start.py -c /usr/local/config.json
Instead of getting this path it is getting the default value defined (/tmp/config.json).
print args.config ---> "/tmp/config.json"
What am I doing wrong here?
The standard documentation doesn't mention any FileAction. Instead there's a class FileType, intended for the type argument, not for action.
So I would write something like this:
DEFAULT_START_CONFIG='/tmp/config.json'
parser = argparse.ArgumentParser(description='Start the Cos service and broker for development purposes.')
parser.add_argument('-c', '--config', default=DEFAULT_START_CONFIG,
                    type=argparse.FileType('r'), help='start configuration json file')
args = parser.parse_args()
print(args)
This gives me the following:
$ python test3.py
Namespace(config=<open file '/tmp/config.json', mode 'r' at 0x7fd758148540>)
$ python test3.py -c
usage: test3.py [-h] [-c CONFIG]
test3.py: error: argument -c/--config: expected one argument
$ python test3.py -c some.json
usage: test3.py [-h] [-c CONFIG]
test3.py: error: argument -c/--config: can't open 'some.json': [Errno 2] No such file or directory: 'some.json'
$ touch existing.json
$ python test3.py -c existing.json
Namespace(config=<open file 'existing.json', mode 'r' at 0x7f93e27a0540>)
You may subclass argparse.FileType to something like a JsonROFileType which would check whether the supplied file actually contains JSON of the expected format, but that seems to be out of the scope of the question.
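For completeness, such a subclass might look roughly like the sketch below (the name JsonROFileType is taken from the sentence above, and rewinding with seek(0) assumes a real file rather than stdin):
import argparse
import json

class JsonROFileType(argparse.FileType):
    """Read-only FileType that also checks the file parses as JSON."""
    def __init__(self):
        super(JsonROFileType, self).__init__('r')

    def __call__(self, string):
        handle = super(JsonROFileType, self).__call__(string)
        try:
            json.load(handle)
        except ValueError as exc:
            raise argparse.ArgumentTypeError(
                "%s is not valid JSON: %s" % (string, exc))
        handle.seek(0)  # rewind so the caller can read the file again
        return handle
It would then be used as type=JsonROFileType() instead of type=argparse.FileType('r').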
I have a Python program that takes as its (only) positional command-line argument one or more file path expressions. I'm using argparse for the CL parsing, and argparse.REMAINDER for the variable that contains the file path(s). See code below:
import argparse
import sys
# Create parser
parser = argparse.ArgumentParser(
    description="My test program")

def parseCommandLine():
    # Add arguments
    parser.add_argument('filesIn',
                        action="store",
                        type=str,
                        nargs=argparse.REMAINDER,
                        help="input file(s)")
    # Parse arguments
    args = parser.parse_args()
    return(args)

def main():
    # Get input from command line
    args = parseCommandLine()
    # Input files
    filesIn = args.filesIn
    # Print help message and exit if filesIn is empty
    if len(filesIn) == 0:
        parser.print_help()
        sys.exit()
    # Do something
    print(filesIn)

if __name__ == "__main__":
    main()
Now, when a user runs the script without any arguments, this results in the following help message:
usage: test.py [-h] ...
Where ... represents the positional input. From a user's perspective it would be more informative if the name of the variable (filesIn) was displayed here instead. Especially because typing test.py -h results in this:
usage: test.py [-h] ...
My test program
positional arguments:
filesIn input file(s)
I.e. the usage line displays ... but then in the list of positional arguments filesIn is used.
So my question is whether there's some easy way to change this (i.e. always display filesIn)?
Don't use argparse.REMAINDER here. It is meant for capturing the rest of the command line verbatim (to hand on to another program, for example), not for collecting a list of filenames.
Use nargs='+' instead to capture all remaining arguments as filenames, requiring at least one:
parser.add_argument('filesIn',
                    action="store",
                    type=str,
                    nargs='+',
                    help="input file(s)")
This produces better help output:
$ bin/python test.py
usage: test.py [-h] filesIn [filesIn ...]
test.py: error: too few arguments
$ bin/python test.py -h
usage: test.py [-h] filesIn [filesIn ...]
My test program
positional arguments:
filesIn input file(s)
optional arguments:
-h, --help show this help message and exit
I've got a dozen programs that can accept input via stdin or an option, and I'd like to implement the same features in a similar way for the output.
The optparse code looks like this:
parser.add_option('-f', '--file',
                  default='-',
                  help='Specifies the input file. The default is stdin.')
parser.add_option('-o', '--output',
                  default='-',
                  help='Specifies the output file. The default is stdout.')
The rest of the applicable code looks like this:
if opts.filename == '-':
    infile = sys.stdin
else:
    infile = open(opts.filename, "r")

if opts.output == '-':
    outfile = sys.stdout
else:
    outfile = open(opts.output, "w")
This code works fine and I like its simplicity - but I haven't been able to find a reference to anyone using a default value of '-' for output to indicate stdout. Is this a good consistent solution or am I overlooking something better or more expected?
For input files you could use the fileinput module. It follows the common convention for input files: if no files are given or a filename is '-', it reads stdin; otherwise it reads from the files given on the command line.
There is no need for -f and --file options. If your program always requires an input file then it is not really an option.
-o and --output are commonly used to specify the output file name in various programs.
optparse
#!/usr/bin/env python
import fileinput
import sys
from optparse import OptionParser
parser = OptionParser()
parser.add_option('-o', '--output',
                  help='Specifies the output file. The default is stdout.')
options, files = parser.parse_args()

if options.output and options.output != '-':
    sys.stdout = open(options.output, 'w')

for line in fileinput.input(files):
    process(line)
argparse
argparse module allows you to specify explicitly files as arguments:
#!/usr/bin/env python
import fileinput
import sys
from argparse import ArgumentParser
parser = ArgumentParser()
parser.add_argument('files', nargs='*', help='specify input files')
group = parser.add_mutually_exclusive_group()
group.add_argument('-o', '--output',
                   help='specify the output file. The default is stdout')
group.add_argument('-i', '--inplace', action='store_true',
                   help='modify files inplace')
args = parser.parse_args()

if args.output and args.output != '-':
    sys.stdout = open(args.output, 'w')

for line in fileinput.input(args.files, inplace=args.inplace):
    process(line)
Note: I've added --inplace option in the second example:
$ python util-argparse.py --help
usage: util-argparse.py [-h] [-o OUTPUT | -i] [files [files ...]]
positional arguments:
files specify input files
optional arguments:
-h, --help show this help message and exit
-o OUTPUT, --output OUTPUT
specify the output file. The default is stdout
-i, --inplace modify files inplace
If you can use argparse (i.e. Python 2.7+), it has built-in support for what you want; straight from the argparse docs:
The FileType factory creates objects that can be passed to the type argument of ArgumentParser.add_argument(). Arguments that have FileType objects as their type will open command-line arguments […] FileType objects understand the pseudo-argument '-' and automatically convert this into sys.stdin for readable FileType objects and sys.stdout for writable FileType objects.
So my advice is to simply use
import sys
import argparse
parser = argparse.ArgumentParser()
parser.add_argument('file', type=argparse.FileType('r'),
                    help="Specifies the input file")
parser.add_argument('output', type=argparse.FileType('w'),
                    help="Specifies the output file")
args = parser.parse_args(sys.argv[1:])

# Here you can use your files
text = args.file.read()
args.output.write(text)
# … and so on
Then you can do
> python spam.py file output
To read from file and output to output, or
> echo "Ni!" | python spam.py - output
to read "Ni!" and output to output, or
> python spam.py file -
…
And that is good, since using - for the relevant stream is a convention that a lot of programs use. If you want to point it out to your users, add it to the help strings:
parser.add_argument('file', type=argparse.FileType('r'),
                    help="Specifies the input file, '-' for standard input")
parser.add_argument('output', type=argparse.FileType('w'),
                    help="Specifies the output file, '-' for standard output")
For reference, the usage message will be
> python spam.py -h
usage: [-h] file output
positional arguments:
file Specifies the input file, '-' for standard input
output Specifies the output file, '-' for standard output
optional arguments:
-h, --help show this help message and exit