messaging for command line programs - python

I tend to write a lot of command line utility programs and was wondering if
there is a standard way of messaging the user in Python. Specifically, I would like to print error and warning messages, as well as other more conversational output in a manner that is consistent with Unix conventions. I could produce these myself using the built-in print function, but the messages have a uniform structure so it seems like it would be useful to have a package to handle this for me.
For example, for commands that you run directly in the command line you might
get messages like this:
This is normal output.
error: no files given.
error: parse.c: no such file or directory.
error: parse.c:7:16: syntax error.
warning: /usr/lib64/python2.7/site-packages/simplejson:
not found, skipping.
If the commands might be run in a script or pipeline, they should include their name:
grep: /usr/dict/words: no such file or directory.
It would be nice if it could also handle levels of verbosity.
These things are all relatively simple in concept, but can result in a lot of
extra conditionals and complexity for each print statement.
I have looked at the logging facility in Python, but it seems overly complicated and more suited for daemons than command line utilities.

I can recommend Inform. It is the only package I have seen that seems to address this need. It provides a variety of print functions that print in different circumstances or with different headers. For example:
log() -- prints to log file, no header
comment() -- prints if verbose, no header
display() -- prints if not quiet, no header
output() -- always prints, no header
warning() -- always prints with warning header
error() -- always prints with error header
fatal() -- always prints with error header, terminates program.
Inform refers to these functions as 'informants'. Informants are very similar to the Python print function in that they take any number of arguments and build the message by joining them together. They also allow you to specify a culprit, which is prepended to the message.
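For instance, here is a small sketch of how the informants and culprits could produce messages like the ones shown above (the quiet/verbose behavior is configured through Inform(); the composed culprit on the last line is preformatted by hand rather than relying on any particular joining behavior):
from inform import display, error

display('This is normal output.')          # shown unless quiet is enabled
error('no files given.')                   # prints "error: no files given."
error('no such file or directory.', culprit='parse.c')
# assumption on my part: a compound location can simply be passed preformatted
error('syntax error.', culprit='parse.c:7:16')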
For example, here is a simple search and replace program written using Inform.
#!/usr/bin/env python3
"""
Replace a string in one or more files.

Usage:
    replace [options] <target> <replacement> <file>...

Options:
    -v, --verbose    indicate whether file is changed
"""
from docopt import docopt
from inform import Inform, comment, error, os_error
from pathlib import Path

# read command line
cmdline = docopt(__doc__)
target = cmdline['<target>']
replacement = cmdline['<replacement>']
filenames = cmdline['<file>']
Inform(verbose=cmdline['--verbose'], prog_name=True)

for filename in filenames:
    try:
        filepath = Path(filename)
        orig = filepath.read_text()
        new = orig.replace(target, replacement)
        comment('updated' if orig != new else 'unchanged', culprit=filename)
        filepath.write_text(new)
    except OSError as e:
        error(os_error(e))
Inform() is used to specify your preferences; comment() and error() are the
informants, which actually print the messages; and os_error() is a useful utility that converts OSError exceptions into a string that can be used as an error message.
If you were to run this, you might get the following output:
> replace -v tiger toe eeny meeny miny moe
eeny: updated
meeny: unchanged
replace error: miny: no such file or directory.
replace error: moe: no such file or directory.
Hopefully this gives you an idea of what Inform does. There is a lot more power there. For example, it provides a collection of utilities that are useful when printing messages. An example is os_error(), but there are others. You can also define your own informants, which is a way of handling multiple levels of verbosity.
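As a rough sketch of that last point (adapted from my recollection of the Inform documentation; verify the InformantFactory name and the convention of storing extra settings such as verbosity on the informer before relying on it):
from inform import Inform, InformantFactory

# assumption: informants created this way consult the active informer to
# decide whether they should print
verbose1 = InformantFactory(output=lambda informer: informer.verbosity >= 1)
verbose2 = InformantFactory(output=lambda informer: informer.verbosity >= 2)

Inform(verbosity=1)
verbose1('shown at verbosity 1 and above')
verbose2('shown at verbosity 2 and above')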

import logging
logging.basicConfig(level=logging.DEBUG, format='%(asctime)s %(levelname)s %(message)s')
The level specified above controls the verbosity of the output.
You can attach handlers (this is where the complexity outweighs the benefit in my case) to the logging to send output to different places (https://docs.python.org/2/howto/logging-cookbook.html#multiple-handlers-and-formatters) but I haven't needed more than command line output to date.
To produce output you must specify its verbosity as you log it:
logging.debug("This debug message will rarely appeal to end users")
I hadn't read your very last line; the answer seemed obvious by then, and I wouldn't have imagined that a single basicConfig line could be described as "overly complicated". It's all I use 60% of the time when print is not enough.
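To tie this back to the levels of verbosity mentioned in the question, a minimal sketch (the convention of counting -v flags is my own choice, not something from this answer):
import argparse
import logging

parser = argparse.ArgumentParser()
parser.add_argument('-v', '--verbose', action='count', default=0)
args = parser.parse_args()

# 0 -> WARNING, 1 -> INFO, 2 or more -> DEBUG
level = {0: logging.WARNING, 1: logging.INFO}.get(args.verbose, logging.DEBUG)
logging.basicConfig(level=level, format='%(levelname)s: %(message)s')

logging.warning('always shown')
logging.info('shown with -v')
logging.debug('shown with -vv')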

Related

Ansible - Testing if called python script is indeed running via Ansible

I'm looking for a way to test, in my python script, whether that script is running from Ansible, so I can also run it through a shell (for running unit tests etc). Calling AnsibleModule when the script is not run from an Ansible playbook just waits endlessly for a response that will never come.
I'm expecting that there isn't a simple test and that I have to restructure in some way, but I'm open to any options.
def main():
    # must test if running via ansible before next line
    module = AnsibleModule(
        argument_spec=dict(
            server=dict(required=True, type='str'),
            [...]
        )
    )
    [... do things ...]

if __name__ == "__main__":
    if running_via_ansible:
        main()
    else:
        run_tests()
I believe there are a couple of answers, with various levels of trickery involved:
Since your module is written in Python, Ansible will use the AnsiballZ framework to run it, which means its sys.argv[0] will start with AnsiballZ_; it will also likely be written to $HOME/.ansible/tmp on the target machine, so one could also sniff for .ansible/tmp showing up in argv[0].
If the file contains the string WANT_JSON, then Ansible will invoke it with the module's JSON payload as the first argument instead of feeding it JSON on sys.stdin (thus far the filename has been colocated with the AnsiballZ_ script, but I don't know that such a thing is guaranteed).
Similarly, although apparently far more Python specific: if it contains a triple-quoted sentinel """<<INCLUDE_ANSIBLE_MODULE_JSON_ARGS>>""" (the ''' flavor works, too), then that magic string is replaced by the serialized JSON that, again, would otherwise have been provided via stdin.
While this may not apply or be helpful, I would actually expect any local testing environment to have more "fingerprints" than production, so detecting the test environment rather than the opposite has the pleasing side effect of "failing open": the module assumes it is running in production mode unless it can prove it is in testing mode, which should make for fewer weird false positives. Then again, I guess the reasonable default depends on how problematic it would be for the module to attempt to carry out its payload when not really in use.
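Putting those heuristics together, a rough sketch of the kind of check one could try (the AnsiballZ_ and .ansible/tmp markers are observations from above, not a documented guarantee):
import sys

def running_via_ansible():
    # heuristic, not an official API: the AnsiballZ wrapper and its temp
    # directory tend to show up in argv[0] when Ansible runs the module
    argv0 = sys.argv[0] if sys.argv else ''
    return 'AnsiballZ_' in argv0 or '.ansible/tmp' in argv0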

Am I using "warnings" module right?

I am using this to issue warnings while parsing a configuration file. All sorts of errors could happen while doing this - some fatal, some not. The non-fatal errors should not interrupt the parsing, but they must not escape the user's attention either. This is where the warnings module comes in.
I am currently doing this (pseudo code):
while parsing:
    try:
        get dictionary["token"]
    except KeyError:
        warnings.warn("Looks like your config file don't have that token")
This all looks readable and cozy, but the message looks something like this:
C:\Users\Renae\Documents\test.py:3: UserWarning: Looks like your config file don't have that token
warnings.warn("Looks like your config file don't have that token")
Why is it printed twice? Should I be doing some sort of initialization before issuing warnings (like with the logging module)? The standard docs don't have a tutorial on this (or do they?).
What differentiates warnings from print(), stdout or stderr?
When you are using the warnings module, the second line is actually the source line taken from the stack; you can control which stack frame is shown using the stacklevel argument. Example -
import warnings

def warn():
    warnings.warn("Blah", stacklevel=2)

warn()
This results in -
a.py:6: UserWarning: Blah
warn()
If you set it to a non-existent level, in the above example let's say 3, then it does not print the offending source line. Example -
def warn():
    warnings.warn("Blah", stacklevel=3)
Result -
sys:1: UserWarning: Blah
Though as you can see, the file also changed to sys:1. You might want to show a meaningful location there (maybe something like stacklevel=2, which points at the caller of the function in which the warning was raised).
Another way to suppress this would be to use the warnings.warn_explicit() method and manually pass in the filename and line number (the referenced line should not contain any actual code, otherwise that code would be printed), though I do not advise this.
Also, yes, when using the warnings module the output normally goes to sys.stderr, but you can easily send warnings somewhere else, for example by overriding warnings.showwarning().
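As a brief sketch of that last point (replacing warnings.showwarning is a documented hook; the compact message format below is just my choice):
import sys
import warnings

def plain_showwarning(message, category, filename, lineno, file=None, line=None):
    # drop the echoed source line and print a compact, single-line warning
    print("%s: %s" % (category.__name__, message), file=file or sys.stderr)

warnings.showwarning = plain_showwarning
warnings.warn("Looks like your config file don't have that token")
# stderr now shows: UserWarning: Looks like your config file don't have that token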

Allow argument/option to override positional argument

I am trying to make an exclusion in my argparse parser. Basically what I want is to prevent the --all option and the filenames argument from being used together (which I think I have managed).
But I also want another check: if I pass only python reader.py read --all, the filenames argument should get populated with all the txt files in the current directory.
So far I've come up with following code:
import argparse
import glob

parser = argparse.ArgumentParser()
subcommands = parser.add_subparsers(title='subcommands')
read_command = subcommands.add_parser('read')
read_command.add_argument('filenames', type=argparse.FileType(), nargs='+')
read_command.add_argument('-a', '--all', action='store_true')
parsed = parser.parse_args()

if parsed.all and parsed.filenames:
    raise SystemExit
if parsed.all:
    parsed.filenames = glob.glob('*.txt')

print parsed
The problem is that if I try to run python reader.py read --all I get error: too few arguments because of the filenames argument.
Is there a way to have this work the way I want without creating a subcommand of read, for example python reader.py read all?
How can I access error messages in argparse? I'd like to have some default message that would say that filenames and --all can't be combined instead of SystemExit error.
Also I want to avoid using add_mutually_exclusive_group because this is just a snippet of my real parser where this approach wouldn't work (already checked in other SO topic).
I've heard about custom actions but it would really help to see example on it.
If filenames gets nargs="*", it should allow you to use --all alone. parsed.filenames will then be an empty list, which you can replace with the glob.
You could also try giving that argument a default derived from the glob - but see my caution regarding FileType.
Do you want the parser to open all the filenames you give it? Or would you rather open the files later yourself (preferably in a with context)? FileType opens the files (creating if necessary), and in the process checks their existence (which is nice), but leaves it up to you (or the program exit) to close them.
The documentation talks about issuing error messages yourself, and how to change them. parser.error('my message') will display the usage and the message, and then exit.
if parsed.all and parsed.filenames:
    parser.error("Do you want to read ALL or just %s?" % parsed.filenames)
It is also possible to trap SystemExit exceptions in a try/except clause.
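Putting those suggestions together, a rough sketch (it drops FileType, so you would open the files yourself later; the error wording is just an example):
import argparse
import glob

parser = argparse.ArgumentParser()
subcommands = parser.add_subparsers(title='subcommands')
read_command = subcommands.add_parser('read')
read_command.add_argument('filenames', nargs='*')   # plain paths, opened later in a with block
read_command.add_argument('-a', '--all', action='store_true')
parsed = parser.parse_args()

if parsed.all and parsed.filenames:
    parser.error("filenames and --all cannot be combined")
if parsed.all:
    parsed.filenames = glob.glob('*.txt')

print(parsed)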

Multi-line logging in Python

I'm using Python 3.3.5 and the logging module to log information to a local file (from different threads). There are cases where I'd like to output some additional information, without knowing exactly what that information will be (e.g. it might be one single line of text or a dict).
What I'd like to do is add this additional information to my log file, after the log record has been written. Furthermore, the additional info is only necessary when the log level is error (or higher).
Ideally, it would look something like:
2014-04-08 12:24:01 - INFO - CPU load not exceeded
2014-04-08 12:24:26 - INFO - Service is running
2014-04-08 12:24:34 - ERROR - Could not find any active server processes
Additional information, might be several lines.
Dict structured information would be written as follows:
key1=value1
key2=value2
2014-04-08 12:25:16 - INFO - Database is responding
Short of writing a custom log formatter, I couldn't find much which would fit my requirements. I've read about filters and contexts, but again this doesn't seem like a good match.
Alternatively, I could just write to a file using the standard I/O, but most of the functionality already exists in the Logging module, and moreover it's thread-safe.
Any input would be greatly appreciated. If a custom log formatter is indeed necessary, any pointers on where to start would be fantastic.
Keeping in mind that many people consider multi-line log messages bad practice (understandably so: log processors like DataDog or Splunk are well prepared to handle single-line logs, and multi-line logs are much harder for them to parse), you can play with the extra parameter and use a custom filter to append the additional information to the message that is going to be shown (take a look at the usage of 'extra' in the logging package documentation).
import logging

class CustomFilter(logging.Filter):
    def filter(self, record):
        # if the record carries a 'dct' attribute (passed via extra=...),
        # append its items to the message, one per line
        if hasattr(record, 'dct') and len(record.dct) > 0:
            for k, v in record.dct.items():
                record.msg = record.msg + '\n\t' + k + ': ' + v
        return super(CustomFilter, self).filter(record)

if __name__ == "__main__":
    logging.getLogger().setLevel(logging.DEBUG)
    extra_logger = logging.getLogger('extra_logger')
    extra_logger.setLevel(logging.INFO)
    extra_logger.addFilter(CustomFilter())
    logging.debug("Nothing special here... Keep walking")
    extra_logger.info("This shows extra",
                      extra={'dct': {"foo": "bar", "baz": "loren"}})
    extra_logger.debug("You shouldn't be seeing this in the output")
    extra_logger.setLevel(logging.DEBUG)
    extra_logger.debug("Now you should be seeing it!")
That code outputs:
DEBUG:root:Nothing special here... Keep walking
INFO:extra_logger:This shows extra
foo: bar
baz: loren
DEBUG:extra_logger:Now you should be seeing it!
I still recommend calling the super's filter function in your custom filter, mainly because that's the function that decides whether to show the message or not (for instance, if your logger's level is set to logging.INFO and you log something using extra_logger.debug, that message shouldn't be seen, as shown in the example above).
I just add \n symbols to the output text.
I'm using a simple line splitter in my smaller applications:
# logtime is a preformatted timestamp and logmessage the (possibly multi-line) text
for line in logmessage.splitlines():
    writemessage = logtime + " - " + line + "\n"
    logging.info(str(writemessage))
Note that this is not thread-safe and should probably only be used in low-volume logging applications.
However you can output to log almost anything, as it will preserve your formatting. I have used it for example to output JSON API responses formatted using: json.dumps(parsed, indent=4, sort_keys=True)
It seems that I made a small typo when defining my LogFormatter string: by accidentally escaping the newline character, I wrongly assumed that writing multi-line output to a log file was not possible.
Cheers to @Barafu for pointing this out (which is why I assigned him the correct answer).
Here's the sample code:
import logging
lf = logging.Formatter('%(levelname)-8s - %(message)s\n%(detail)s')
lh = logging.FileHandler(filename=r'c:\temp\test.log')
lh.setFormatter(lf)
log = logging.getLogger()
log.setLevel(logging.DEBUG)
log.addHandler(lh)
log.debug('test', extra={'detail': 'This is a multi-line\ncomment to test the formatter'})
The resulting output would look like this:
DEBUG - test
This is a multi-line
comment to test the formatter
Caveat:
If there is no detail information to log, and you pass an empty string, the logger will still output a newline. Thus, the remaining question is: how can we make this conditional?
One approach would be to update the logging formatter before actually logging the information, as described here.
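One way to make it conditional, as a sketch (this overrides Formatter.format(), which is a documented extension point; it is not necessarily the approach described in the linked answer):
import logging

class DetailFormatter(logging.Formatter):
    def format(self, record):
        base = super().format(record)
        detail = getattr(record, 'detail', '')
        # only add the extra block when there is something to show
        return base + '\n' + detail if detail else base

handler = logging.StreamHandler()
handler.setFormatter(DetailFormatter('%(levelname)-8s - %(message)s'))
log = logging.getLogger('demo')
log.addHandler(handler)
log.setLevel(logging.DEBUG)

log.debug('no extra info here')
log.error('something failed', extra={'detail': 'This is a multi-line\ncomment to test the formatter'})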

Python CLI program unit testing

I am working on a python Command-Line-Interface program, and I find it boring when doing testings, for example, here is the help information of the program:
usage: pyconv [-h] [-f ENCODING] [-t ENCODING] [-o file_path] file_path
Convert text file from one encoding to another.
positional arguments:
file_path
optional arguments:
-h, --help show this help message and exit
-f ENCODING, --from ENCODING
Encoding of source file
-t ENCODING, --to ENCODING
Encoding you want
-o file_path, --output file_path
Output file path
When I make changes to the program and want to test something, I must open a terminal, type the command (with options and arguments), press enter, and see if any error occurs while running. If an error does occur, I must go back to the editor and check the code from top to bottom, guess where the bug is, make small changes, add print lines, return to the terminal, run the command again...
Recursively.
So my question is, what is the best way to do testing with CLI program, can it be as easy
as unit testing with normal python scripts?
I think it's perfectly fine to test functionally on a whole-program level. It's still possible to test one aspect/option per test. This way you can be sure that the program really works as a whole. Writing unit-tests usually means that you get to execute your tests quicker and that failures are usually easier to interpret/understand. But unit-tests are typically more tied to the program structure, requiring more refactoring effort when you internally change things.
Anyway, using py.test, here is a little example for testing a latin1 to utf8 conversion for pyconv:
# content of test_pyconv.py
import pytest

# we reuse a bit of pytest's own testing machinery, this should eventually come
# from a separately installable pytest-cli plugin.
pytest_plugins = ["pytester"]

@pytest.fixture
def run(testdir):
    def do_run(*args):
        args = ["pyconv"] + list(args)
        return testdir._run(*args)
    return do_run

def test_pyconv_latin1_to_utf8(tmpdir, run):
    input = tmpdir.join("example.txt")
    content = unicode("\xc3\xa4\xc3\xb6", "latin1")
    with input.open("wb") as f:
        f.write(content.encode("latin1"))
    output = tmpdir.join("example.txt.utf8")
    result = run("-flatin1", "-tutf8", input, "-o", output)
    assert result.ret == 0
    with output.open("rb") as f:
        newcontent = f.read()
    assert content.encode("utf8") == newcontent
After installing pytest ("pip install pytest") you can run it like this:
$ py.test test_pyconv.py
=========================== test session starts ============================
platform linux2 -- Python 2.7.3 -- pytest-2.4.5dev1
collected 1 items
test_pyconv.py .
========================= 1 passed in 0.40 seconds =========================
The example reuses some internal machinery of pytest's own testing by leveraging pytest's fixture mechanism, see http://pytest.org/latest/fixture.html. If you forget about the details for a moment, you can just work from the fact that "run" and "tmpdir" are provided for helping you to prepare and run tests. If you want to play, you can try to insert a failing assert-statement or simply "assert 0" and then look at the traceback or issue "py.test --pdb" to enter a python prompt.
Start from the user interface with functional tests and work down towards unit tests. It can feel difficult, especially when you use the argparse module or the click package, which take control of the application entry point.
The cli-test-helpers Python package has examples and helper functions (context managers) for a holistic approach on writing tests for your CLI. It's a simple idea, and one that works perfectly with TDD:
Start with functional tests (to ensure your user interface definition) and
Work towards unit tests (to ensure your implementation contracts)
Functional tests
NOTE: I assume you develop code that is deployed with a setup.py file or is run as a module (-m).
Is the entrypoint script installed? (tests the configuration in your setup.py)
Can this package be run as a Python module? (i.e. without having to be installed)
Is command XYZ available? etc. Cover your entire CLI usage here!
Those tests are simplistic: They run the shell command you would enter in the terminal, e.g.
import os

def test_entrypoint():
    exit_status = os.system('foobar --help')
    assert exit_status == 0
Note the trick to use a non-destructive operation (e.g. --help or --version) as we can't mock anything with this approach.
Towards unit tests
To test single aspects inside the application you will need to mimic things like command line arguments and maybe environment variables. You will also need to catch your script's exiting to avoid the tests failing with SystemExit exceptions.
Example with ArgvContext to mimic command line arguments:
import pytest
from unittest.mock import patch
from cli_test_helpers import ArgvContext
import foobar.cli

@patch('foobar.command.baz')
def test_cli_command(mock_command):
    """Is the correct code called when invoked via the CLI?"""
    with ArgvContext('foobar', 'baz'), pytest.raises(SystemExit):
        foobar.cli.main()
    assert mock_command.called
Note that we mock the function that we want our CLI framework (click in this example) to call, and that we catch SystemExit that the framework naturally raises. The context managers are provided by cli-test-helpers and pytest.
Unit tests
The rest is business as usual. With the above two strategies we've overcome the control a CLI framework may have taken away from us. The rest is usual unit testing. TDD-style hopefully.
Disclosure: I am the author of the cli-test-helpers Python package.
So my question is, what is the best way to do testing with CLI program, can it be as easy as unit testing with normal python scripts?
The only difference is that when you run a Python module as a script, its __name__ attribute is set to '__main__'. So generally, if you intend to run your script from the command line it should have the following form:
import sys

# function and class definitions, etc.
# ...

def foo(arg):
    pass

def main():
    """Entry point to the script"""
    # Do parsing of command line arguments and other stuff here. And then
    # make calls to whatever functions and classes that are defined in your
    # module. For example:
    foo(sys.argv[1])

if __name__ == '__main__':
    main()
Now there is no difference in how you use it: as a script or as a module. So inside your unit-testing code you can just import the foo function, call it, and make any assertions you want.
Maybe too little too late, but you can always use:
import os

result = os.system('<insert your command with options here>')
assert result == 0
That way, you can run your program as if it were run from the command line, and evaluate the exit code.
(Update after I studied pytest)
You can also use capsys.
(from running pytest --fixtures)
capsys
    Enable text capturing of writes to sys.stdout and sys.stderr.
    The captured output is made available via ``capsys.readouterr()`` method
    calls, which return a ``(out, err)`` namedtuple.
    ``out`` and ``err`` will be ``text`` objects.
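A small usage sketch (main and its greeting are hypothetical stand-ins for your own code):
def main(argv):
    print("Hello, %s!" % argv[0])

def test_main_greets_user(capsys):
    main(["world"])
    out, err = capsys.readouterr()
    assert out == "Hello, world!\n"
    assert err == ""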
This isn't for Python specifically, but what I do to test command-line scripts is to run them with various predetermined inputs and options and store the correct output in a file. Then, to test them when I make changes, I simply run the new script and pipe the output into diff correct_output -. If the files are the same, it outputs nothing. If they're different, it shows you where. This will only work if you are on Linux or OS X; on Windows, you will have to get MSYS.
Example:
python mycliprogram --someoption "some input" | diff correct_output -
To make it even easier, you can add all these test runs to your 'make test' Makefile target, which I assume you already have. ;)
If you are running many of these at once, you could make it a little more obvious where each one ends by adding a fail tag:
python mycliprogram --someoption "some input" | diff correct_output - || tput setaf 1 && echo "FAILED"
The short answer is yes, you can use unit tests, and you should. If your code is well structured, it should be quite easy to test each component separately, and if you need to, you can always mock sys.argv to simulate running it with different arguments.
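For example, a minimal sketch of mocking sys.argv (mycli and its main() are hypothetical placeholders for your own module):
from unittest import mock

import mycli  # hypothetical module whose main() parses sys.argv

def test_main_with_fake_argv():
    with mock.patch('sys.argv', ['mycli', '--count', '3']):
        assert mycli.main() == 0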
pytest-console-scripts is a Pytest plugin for testing python scripts installed via console_scripts entry point of setup.py.
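A rough sketch of what a test with that plugin looks like (based on my recollection of its script_runner fixture; check the plugin's documentation for the current API, and note that mycliprogram is a placeholder entry point):
def test_version_flag(script_runner):
    result = script_runner.run('mycliprogram', '--version')
    assert result.success
    assert result.stderr == ''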
For Python 3.5+, you can use the simpler subprocess.run to call your CLI command from your test.
Using pytest:
import subprocess

def test_command__works_properly():
    try:
        result = subprocess.run(['command', '--argument', 'value'], check=True, capture_output=True, text=True)
    except subprocess.CalledProcessError as error:
        print(error.stdout)
        print(error.stderr)
        raise error
The output can be accessed via result.stdout, result.stderr, and result.returncode if needed.
The check parameter causes an exception to be raised if an error occurs. Note Python 3.7+ is required for the capture_output and text parameters, which simplify capturing and reading stdout/stderr.
Given that you are explicitly asking about testing for a command line application, I believe that you are aware of unit-testing tools in python and that you are actually looking for a tool to automate end-to-end tests of a command line tool. There are a couple of tools out there that are specifically designed for that. If you are looking for something that's pip-installable, I would recommend cram. It integrates well with the rest of the python environment (e.g. through a pytest extension) and it's quite easy to use:
Simply write the commands you want to run, prefixed with $, followed by the expected output. For example, the following would be a valid cram test:
$ echo Hello
Hello
By having four spaces in front of expected output and two in front of the test, you can actually use these tests to also write documentation. More on that on the website.
You can use standard unittest module:
# python -m unittest <test module>
or use nose as a testing framework. Just write classic unittest files in a separate directory and run:
# nosetests <test modules directory>
Writing unit tests is easy. Just follow the online manual for unit testing.
I would not test the program as a whole; this is not a good test strategy and may not actually catch the actual spot of the error. The CLI interface is just a front end to an API. You test the API via your unit tests, and then when you make a change to a specific part you have a test case to exercise that change.
So, restructure your application so that you test the API and not the application itself. But you can have a functional test that actually does run the full application and checks that the output is correct.
In short, yes, testing the code is the same as testing any other code, but you must test the individual parts rather than their combination as a whole to ensure that your changes do not break everything.
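As a sketch of that structure (the pyconv-style names are illustrative, not an existing implementation): keep the argument parsing and the real work in separate functions, so each can be unit tested without touching the terminal.
import argparse

def build_parser():
    parser = argparse.ArgumentParser(prog='pyconv',
        description='Convert text file from one encoding to another.')
    parser.add_argument('file_path')
    parser.add_argument('-f', '--from', dest='from_enc', default='utf-8', metavar='ENCODING')
    parser.add_argument('-t', '--to', dest='to_enc', default='utf-8', metavar='ENCODING')
    return parser

def convert(path, from_enc, to_enc):
    # the "API": pure logic that unit tests can call directly
    with open(path, encoding=from_enc) as fh:
        return fh.read().encode(to_enc)

def main(argv=None):
    args = build_parser().parse_args(argv)
    convert(args.file_path, args.from_enc, args.to_enc)
    return 0
A unit test can then call convert() or main(['-f', 'latin1', 'somefile.txt']) directly, while a single functional test runs the installed command end to end.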
