Multi-line logging in Python

I'm using Python 3.3.5 and the logging module to log information to a local file (from different threads). There are cases where I'd like to output some additional information, without knowing exactly what that information will be (e.g. it might be one single line of text or a dict).
What I'd like to do is add this additional information to my log file, after the log record has been written. Furthermore, the additional info is only necessary when the log level is error (or higher).
Ideally, it would look something like:
2014-04-08 12:24:01 - INFO - CPU load not exceeded
2014-04-08 12:24:26 - INFO - Service is running
2014-04-08 12:24:34 - ERROR - Could not find any active server processes
Additional information, might be several lines.
Dict structured information would be written as follows:
key1=value1
key2=value2
2014-04-08 12:25:16 - INFO - Database is responding
Short of writing a custom log formatter, I couldn't find much that would fit my requirements. I've read about filters and contexts, but again this doesn't seem like a good match.
Alternatively, I could just write to a file using standard I/O, but most of the functionality already exists in the logging module, and moreover it's thread-safe.
Any input would be greatly appreciated. If a custom log formatter is indeed necessary, any pointers on where to start would be fantastic.

Keeping in mind that many people consider multi-line log messages bad practice (understandably so: log processors like DataDog or Splunk are built to handle single-line logs, and multi-line entries are much harder to parse), you can play with the extra parameter and use a custom filter to append content to the message that is going to be shown (take a look at the usage of 'extra' in the logging package documentation).
import logging

class CustomFilter(logging.Filter):
    def filter(self, record):
        # append any 'dct' entries passed via extra={} to the message
        if hasattr(record, 'dct') and len(record.dct) > 0:
            for k, v in record.dct.items():  # iteritems() is Python 2 only
                record.msg = record.msg + '\n\t' + k + ': ' + str(v)
        return super(CustomFilter, self).filter(record)

if __name__ == "__main__":
    logging.getLogger().setLevel(logging.DEBUG)
    extra_logger = logging.getLogger('extra_logger')
    extra_logger.setLevel(logging.INFO)
    extra_logger.addFilter(CustomFilter())
    logging.debug("Nothing special here... Keep walking")
    extra_logger.info("This shows extra",
                      extra={'dct': {"foo": "bar", "baz": "loren"}})
    extra_logger.debug("You shouldn't be seeing this in the output")
    extra_logger.setLevel(logging.DEBUG)
    extra_logger.debug("Now you should be seeing it!")
That code outputs:
DEBUG:root:Nothing special here... Keep walking
INFO:extra_logger:This shows extra
foo: bar
baz: loren
DEBUG:extra_logger:Now you should be seeing it!
I still recommend calling the super's filter function in your custom filter, mainly because that's the function that decides whether the message is shown or not (for instance, if your logger's level is set to logging.INFO and you log something using extra_logger.debug, that message shouldn't be seen, as shown in the example above).

I just add \n symbols to the output text.
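For instance, a minimal sketch of that approach (the message text is taken from the question's example):
import logging

logging.basicConfig(format='%(asctime)s - %(levelname)s - %(message)s', level=logging.INFO)
# the embedded \n produces the extra lines after the record itself
logging.error("Could not find any active server processes\n"
              "Additional information, might be several lines.")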

I'm using a simple line splitter in my smaller applications:
for line in logmessage.splitlines():
    writemessage = logtime + " - " + line + "\n"
    logging.info(writemessage)
Note that this is not thread-safe and should probably only be used in low-volume logging applications.
However, you can output almost anything to the log, as it will preserve your formatting. I have used it, for example, to output JSON API responses formatted using: json.dumps(parsed, indent=4, sort_keys=True)
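A sketch of how that can look in practice (the parsed response here is made up for illustration):
import json
import logging

logging.basicConfig(level=logging.INFO)
parsed = {"status": "ok", "items": [1, 2, 3]}  # hypothetical parsed API response
for line in json.dumps(parsed, indent=4, sort_keys=True).splitlines():
    logging.info(line)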

It seems that I made a small typo when defining my LogFormatter string: by accidentally escaping the newline character, I wrongly assumed that writing multi-line output to a log file was not possible.
Cheers to @Barafu for pointing this out (which is why I assigned him the correct answer).
Here's the sample code:
import logging
lf = logging.Formatter('%(levelname)-8s - %(message)s\n%(detail)s')
lh = logging.FileHandler(filename=r'c:\temp\test.log')
lh.setFormatter(lf)
log = logging.getLogger()
log.setLevel(logging.DEBUG)
log.addHandler(lh)
log.debug('test', extra={'detail': 'This is a multi-line\ncomment to test the formatter'})
The resulting output would look like this:
DEBUG - test
This is a multi-line
comment to test the formatter
Caveat:
If there is no detail information to log, and you pass an empty string, the logger will still output a newline. Thus, the remaining question is: how can we make this conditional?
One approach would be to update the logging formatter before actually logging the information, as described here.
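For illustration, here is one possible sketch of that idea: a custom formatter (the name ConditionalDetailFormatter is made up) that appends the detail block only when it is non-empty:
import logging

class ConditionalDetailFormatter(logging.Formatter):
    '''Appends record.detail on its own line(s), but only when it is non-empty.'''
    def format(self, record):
        s = super().format(record)
        detail = getattr(record, 'detail', '')
        if detail:
            s += '\n' + detail
        return s

lh = logging.FileHandler(filename=r'c:\temp\test.log')
lh.setFormatter(ConditionalDetailFormatter('%(levelname)-8s - %(message)s'))
log = logging.getLogger()
log.setLevel(logging.DEBUG)
log.addHandler(lh)
log.debug('no detail here', extra={'detail': ''})  # no stray newline this time
log.debug('test', extra={'detail': 'This is a multi-line\ncomment to test the formatter'})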

Related

messaging for command line programs

I tend to write a lot of command line utility programs and was wondering if there is a standard way of messaging the user in Python. Specifically, I would like to print error and warning messages, as well as other more conversational output, in a manner that is consistent with Unix conventions. I could produce these myself using the built-in print function, but the messages have a uniform structure, so it seems like it would be useful to have a package handle this for me.
For example, for commands that you run directly in the command line, you might get messages like this:
This is normal output.
error: no files given.
error: parse.c: no such file or directory.
error: parse.c:7:16: syntax error.
warning: /usr/lib64/python2.7/site-packages/simplejson:
not found, skipping.
If the commands might be run in a script or pipeline, they should include their name:
grep: /usr/dict/words: no such file or directory.
It would be nice if it could handle levels of verbosity.
These things are all relatively simple in concept, but can result in a lot of extra conditionals and complexity for each print statement.
I have looked at the logging facility in Python, but it seems overly complicated and more suited for daemons than command line utilities.
I can recommend Inform. It is the only package I have seen that seems to address this need. It provides a variety of print functions that print in different circumstances or with different headers. For example:
log() -- prints to log file, no header
comment() -- prints if verbose, no header
display() -- prints if not quiet, no header
output() -- always prints, no header
warning() -- always prints with warning header
error() -- always prints with error header
fatal() -- always prints with error header, terminates program.
Inform refers to these functions as 'informants'. Informants are very similar to the Python print function in that they take any number of arguments and build the message by joining them together. It also allows you to specify a culprit, which is added to the front of the message.
For example, here is a simple search and replace program written using Inform.
#!/usr/bin/env python3
"""
Replace a string in one or more files.

Usage:
    replace [options] <target> <replacement> <file>...

Options:
    -v, --verbose    indicate whether file is changed
"""
from docopt import docopt
from inform import Inform, comment, error, os_error
from pathlib import Path

# read command line
cmdline = docopt(__doc__)
target = cmdline['<target>']
replacement = cmdline['<replacement>']
filenames = cmdline['<file>']
Inform(verbose=cmdline['--verbose'], prog_name=True)

for filename in filenames:
    try:
        filepath = Path(filename)
        orig = filepath.read_text()
        new = orig.replace(target, replacement)
        comment('updated' if orig != new else 'unchanged', culprit=filename)
        filepath.write_text(new)
    except OSError as e:
        error(os_error(e))
Inform() is used to specify your preferences; comment() and error() are the informants, which actually print the messages; and os_error() is a useful utility that converts OSError exceptions into a string that can be used as an error message.
If you were to run this, you might get the following output:
> replace -v tiger toe eeny meeny miny moe
eeny: updated
meeny: unchanged
replace error: miny: no such file or directory.
replace error: moe: no such file or directory.
Hopefully this gives you an idea of what Inform does. There is a lot more power there. For example, it provides a collection of utilities that are useful when printing messages. An example is os_error(), but there are others. You can also define your own informants, which is a way of handling multiple levels of verbosity.
import logging
logging.basicConfig(level=logging.DEBUG, format='%(asctime)s %(levelname)s %(message)s')
level specified above controls the verbosity of the output.
You can attach handlers (this is where the complexity outweighs the benefit in my case) to the logging to send output to different places (https://docs.python.org/2/howto/logging-cookbook.html#multiple-handlers-and-formatters) but I haven't needed more than command line output to date.
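For completeness, a minimal sketch of such a multiple-handler setup (the logger name and file name are made up):
import logging
import sys

logger = logging.getLogger('myapp')
logger.setLevel(logging.DEBUG)

console = logging.StreamHandler(sys.stderr)  # command line output
console.setLevel(logging.INFO)

logfile = logging.FileHandler('myapp.log')   # hypothetical log file
logfile.setLevel(logging.DEBUG)

formatter = logging.Formatter('%(asctime)s %(levelname)s %(message)s')
console.setFormatter(formatter)
logfile.setFormatter(formatter)

logger.addHandler(console)
logger.addHandler(logfile)

logger.debug('goes to the file only')
logger.info('goes to both destinations')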
To produce output, you must specify its verbosity as you log it:
logging.debug("This debug message will rarely appeal to end users")
I hadn't read your very last line; the answer seemed obvious by then, and I wouldn't have imagined that a single basicConfig line could be described as "overly complicated". It's all I use 60% of the time when print is not enough.

Python logging formatting by level

I'm using python's logging library, but I want the debug logs to have a different format than the warning and error logs. Is this possible?
ETA: I want warnings and errors to appear as:
%(levelname)s: %(message)s
but debug statements to appear as
DEBUG: (only Brian cares about this) : %(message)s
All the other questions I've seen are about changing the format, but that changes it for EVERYTHING.
First of all, double-check if you really need this. A log output with different record formats is prone to be rather hard to read by both humans and machines.
Maybe what you actually need is different formats for different log destinations (console vs file) which will also have different verbosity (the file will have a debug log with additional information).
Now, the way is to use a custom Formatter:
import logging

class MultiformatFormatter(logging.Formatter):
    def __init__(self):
        # default format, used for everything above DEBUG (per the question)
        super().__init__('%(levelname)s: %(message)s')
        # separate format used only for DEBUG records
        self._debug_formatter = logging.Formatter(
            'DEBUG: (only Brian cares about this) : %(message)s')

    def format(self, record):
        if record.levelno <= logging.DEBUG:
            return self._debug_formatter.format(record)
        return super().format(record)
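Then attach it to each handler that this should apply to; a small usage sketch (the logger name is arbitrary):
import sys

handler = logging.StreamHandler(sys.stderr)
handler.setFormatter(MultiformatFormatter())

logger = logging.getLogger('demo')
logger.setLevel(logging.DEBUG)
logger.addHandler(handler)

logger.debug('low-level detail')    # DEBUG: (only Brian cares about this) : low-level detail
logger.warning('something is off')  # WARNING: something is off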

Am I using "warnings" module right?

I am using this to issue warnings while parsing a configuration file. All sorts of errors could happen while doing this - some fatal, some not. All those non-fatal errors should not interrupt the parsing, but they must not escape the user's attention either. This is where the warnings module comes in.
I am currently doing this (pseudo code):
while parsing:
    try:
        get dictionary["token"]
    except KeyError:
        warnings.warn("Looks like your config file don't have that token")
This all looks readable and cozy, but the message looks something like this:
C:\Users\Renae\Documents\test.py:3: UserWarning: Looks like your config file don't have that token
warnings.warn("Looks like your config file don't have that token")
Why is it printed twice? Should I be doing some sort of initialization before issuing warnings (like with the logging module)? The standard docs don't have a tutorial on this (or do they?).
What differentiates warnings from print(), stdout or stderr?
When you are using the warnings module, the second line printed is actually the source line from the stack; you can control which stack level is shown using the stacklevel argument. Example -
import warnings

def warn():
    warnings.warn("Blah", stacklevel=2)

warn()
This results -
a.py:6: UserWarning: Blah
warn()
If you set it to a non-existent level, in the above example let's say 3, then it does not print the stack. Example -
def warn():
    warnings.warn("Blah", stacklevel=3)
Result -
sys:1: UserWarning: Blah
Though as you can see, the file also changed to sys:1. You might want to show a meaningful stack frame there (maybe something like stacklevel=2 for the caller of the function in which the warning was raised).
Another way to suppress this would be to use the warnings.warn_explicit() method and manually pass in the filename and line number (the line number should not have any actual code on it, otherwise that code would be printed), though I do not advise this.
Also, yes, when using the warnings module, the output normally goes to sys.stderr, but you can also easily send warnings to a different file by overriding warnings.showwarning().
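A minimal sketch of that redirection, done by assigning your own function to warnings.showwarning (the target file name is made up):
import warnings

log_file = open('warnings.log', 'a')  # hypothetical destination

def showwarning_to_file(message, category, filename, lineno, file=None, line=None):
    # reuse the standard formatting, but write to our own file instead of stderr
    log_file.write(warnings.formatwarning(message, category, filename, lineno, line))

warnings.showwarning = showwarning_to_file
warnings.warn("this now goes to warnings.log instead of stderr")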

Switching off debug prints

Sometimes I have a lot of prints scattered around a function to print debug output.
To switch these debug outputs on and off, I came up with this:
def f(debug=False):
    print = __builtins__.print if debug else lambda *p: None
Or, if I need to print something apart from debug messages, I create a dprint function for the debug messages.
The problem is, when debug=False, these print calls slow down the code considerably, because lambda *p: None is still called, and function invocations are known to be slow.
So, my question is: Is there any better way to efficiently disable all these debug prints for them not to affect code performance?
All the answers are about my not using the logging module. That is good to note, but it doesn't answer the question of how to avoid the function invocations that slow down the code considerably - in my case 25 times (if that's possible at all, for example by tinkering with the function's code object to throw away all the lines with print statements, or in some other way). What these answers suggest is replacing print with logging.debug, which should be even slower. This question is about getting rid of those function calls completely.
I tried using logging instead of lambda *p: None, and no surprise, code became even slower.
Maybe someone would like to see the code where those prints caused 25 slowdown: http://ideone.com/n5PGu
And I don't have anything against the logging module. I think it's good practice to always stick to robust solutions without hacks. But I think there is nothing criminal if I use those hacks in a 20-line one-time code snippet.
Not as a restriction, but as a suggestion: maybe it's possible to delete some lines (e.g. those starting with print) from the function's source code and recompile it? I laid out this approach in the answer below. Though I would like to see some comments on that solution, I welcome other approaches to solving this problem.
You should use the logging module instead. See http://docs.python.org/library/logging.html
Then you can set the log level depending on your needs, and create multiple logger objects, that log about different subjects.
import logging
#set your log level
logging.basicConfig(level=logging.DEBUG)
logging.debug('This is a log message')
In your case: you could simply replace your print statement with a log statement, e.g.:
import logging
print = __builtins__.print if debug else logging.debug
Now the function will only print anything if you set the logging level to debug:
logging.basicConfig(level=logging.DEBUG)
But as a plus, you can use all other logging features on top! logging.error('error!')
Ned Batchelder wrote in the comment:
I suspect the slow down is in the calculation of the arguments to your debug function. You should be looking for ways to avoid those calculations. Preprocessing Python is just a distraction.
And he is right: the slowdown is actually caused by formatting the string with the format method, which happens regardless of whether the resulting string will be logged or not.
So, string formatting should be deferred and skipped if no logging will occur. This can be achieved by refactoring the dprint function or by using log.debug in the following way:
log.debug('formatted message: %s', interpolated_value)
If the message won't be logged, it won't be formatted, unlike with print, where the string is always formatted regardless of whether it'll be logged or discarded.
The solution using log.debug's postponed formatting was given by Martijn Pieters here.
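To make the difference concrete, a small sketch (the value is made up):
import logging

logging.basicConfig(level=logging.INFO)  # DEBUG records are discarded

value = list(range(1000))
logging.debug('value: {}'.format(value))  # the string is built eagerly, then thrown away
logging.debug('value: %s', value)         # the string is never built; logging skips it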
Another solution could be to dynamically edit the code of f and delete all dprint calls. But this solution is strongly discouraged:
You are correct, you should never resort to this; there are so many ways it can go wrong. First, Python is not a language designed for source-level transformations, and it's hard to write a transformer for it without gratuitously breaking valid code. Second, this hack would break in all kinds of circumstances - for example, when defining methods, when defining nested functions, when used in Cython, or when inspect.getsource fails for whatever reason. Python is dynamic enough that you really don't need this kind of hack to customize its behavior.
Here is the code of this approach, for those who would like to get acquainted with it:
from __future__ import print_function

DEBUG = False

def dprint(*args, **kwargs):
    '''Debug print'''
    print(*args, **kwargs)

_blocked = False
def nodebug(name='dprint'):
    '''Decorator to remove all expression statements that call the function 'name'.'''
    def helper(f):
        global _blocked
        if _blocked:
            return f
        import inspect, ast, sys
        source = inspect.getsource(f)
        a = ast.parse(source)  # get the AST of f (including the decorator line)

        class Transformer(ast.NodeTransformer):
            '''Deletes all expressions containing 'name' functions at the top level.'''
            def visit_Expr(self, node):  # visit all expression statements
                try:
                    if node.value.func.id == name:  # expression is a call to 'name'
                        return None  # delete it
                except AttributeError:  # not a plain function call; keep it
                    pass
                return node  # return node unchanged

        a_new = Transformer().visit(a)
        f_new_compiled = compile(a_new, '<string>', 'exec')
        env = sys.modules[f.__module__].__dict__
        _blocked = True  # the decorator runs again during exec; don't recurse
        try:
            exec(f_new_compiled, env)
        finally:
            _blocked = False
        return env[f.__name__]
    return helper

@nodebug('dprint')
def f():
    dprint('f() started')
    print('Important output')
    dprint('f() ended')
    print('Important output2')

f()
More information: Replacing parts of the function code on-the-fly
As a hack, yes, that works. (And there is no chance in hell those lambda no-ops are your app's bottleneck.)
However, you really should be doing logging properly by using the logging module.
See http://docs.python.org/howto/logging.html#logging-basic-tutorial for a basic example of how this should be done.
You definitely need to use Python's logging module; it's very practical, and you can change the log level of your application. Example:
>>> import logging
>>> logging.basicConfig(level=logging.DEBUG)
>>> logging.debug('Test.')
DEBUG:root:Test.

Using print statements only to debug

I have been coding a lot in Python of late. And I have been working with data that I haven't worked with before, using formulae never seen before and dealing with huge files. All this made me write a lot of print statements to verify if it's all going right and identify the points of failure. But, generally, outputting so much information is not a good practice. How do I use the print statements only when I want to debug and let them be skipped when I don't want them to be printed?
The logging module has everything you could want. It may seem excessive at first, but only use the parts you need. I'd recommend using logging.basicConfig to set the logging level and direct output to stderr, and the simple log methods: debug, info, warning, error and critical.
import logging, sys
logging.basicConfig(stream=sys.stderr, level=logging.DEBUG)
logging.debug('A debug message!')
logging.info('We processed %d records', len(processed_records))
A simple way to do this is to call a logging function:
DEBUG = True

def log(s):
    if DEBUG:
        print(s)

log("hello world")
Then you can change the value of DEBUG and run your code with or without logging.
The standard logging module has a more elaborate mechanism for this.
Use the logging built-in library module instead of printing.
You create a Logger object (say logger) and then, whenever you insert a debug print, you just put:
logger.debug("Some string")
You can use logger.setLevel at the start of the program to set the output level. If you set it to DEBUG, it will print all the debugs. Set it to INFO or higher and immediately all of the debugs will disappear.
You can also use it to log more serious things, at different levels (INFO, WARNING and ERROR).
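A minimal sketch of that setup (the logger name is arbitrary):
import logging

logging.basicConfig()  # default handler writing to stderr
logger = logging.getLogger(__name__)
logger.setLevel(logging.DEBUG)  # print all the debugs; switch to logging.INFO to hide them

logger.debug("Some string")
logger.info("Still shown at INFO and above")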
First off, I will second the nomination of python's logging framework. Be a little careful about how you use it, however. Specifically: let the logging framework expand your variables, don't do it yourself. For instance, instead of:
logging.debug("datastructure: %r" % complex_dict_structure)
make sure you do:
logging.debug("datastructure: %r", complex_dict_structure)
because while they look similar, the first version incurs the repr() cost even if logging is disabled, while the second version avoids it. Similarly, if you roll your own, I'd suggest something like:
def debug_stdout(sfunc):
    print(sfunc())

debug = debug_stdout
called via:
debug(lambda: "datastructure: %r" % complex_dict_structure)
which will, again, avoid the overhead if you disable it by doing:
def debug_noop(*args, **kwargs):
    pass

debug = debug_noop
The overhead of computing those strings probably doesn't matter unless they're either 1) expensive to compute or 2) the debug statement is in the middle of, say, an n^3 loop or something. Not that I would know anything about that.
I don't know about others, but I used to define a "global constant" (DEBUG) and then a global function (debug(msg)) that would print msg only if DEBUG == True.
Then I write my debug statements like:
debug('My value: %d' % value)
...then I picked up unit testing and never did this again! :)
A better way to debug the code is by using the clrprint module.
It prints colorful output only when you pass the parameter debug=True:
from clrprint import *
clrprint('ERROR:', information, clr=['r','y'], debug=True)
