how to replace print debug message with the logging module - python

Up to now, I've been peppering my code with 'print debug message' and even 'if condition: print debug message'. But a number of people have told me that's not the best way to do it, and I really should learn how to use the logging module. After a quick read, it looks as though it does everything I could possibly want, and then some. It looks like a learning project in its own right, and I want to work on other projects now and simply use the minimum functionality to help me. If it makes any difference, I am on python 2.6 and will be for the forseeable future, due to library and legacy compatibilities.
All I want to do at the moment is pepper my code with messages that I can turn on and off section by section, as I manage to debug specific regions. As a 'hello_log_world', I tried this, and it doesn't do what I expected
import logging
# logging.basicConfig(level=logging.DEBUG)
logging.error('first error')
logging.debug('first debug')
logging.basicConfig(level=logging.DEBUG)
logging.error('second error')
logging.debug('second debug')
You'll notice I'm using the really basic config, using as many defaults as possible, to keep things simple. But appears that it's too simple, or that I don't understand the programming model behind logging.
I had expected that sys.stderr would end up with
ERROR:root:first error
ERROR:root:second error
DEBUG:root:second debug
... but only the two error messages appear. Setting level=DEBUG doesn't make the second one appear. If I uncomment the basicConfig call at the start of the program, all four get output.
Am I trying to run it at too simple a level?
What's the simplest thing I can add to what I've written there to get my expected behaviour?

Logging actually follows a particular hierarchy (DEBUG -> INFO -> WARNING -> ERROR -> CRITICAL), and the default level is WARNING. Therefore the reason you see the two ERROR messages is because it is ahead of WARNING on the hierarchy chain.
As for the odd commenting behavior, the explanation is found in the logging docs (which as you say are a task unto themselves :) ):
The call to basicConfig() should come before any calls to debug(),
info() etc. As it’s intended as a one-off simple configuration
facility, only the first call will actually do anything: subsequent
calls are effectively no-ops.
However you can use the setLevel parameter to get what you desire:
import logging
logging.getLogger().setLevel(logging.ERROR)
logging.error('first error')
logging.debug('first debug')
logging.getLogger().setLevel(logging.DEBUG)
logging.error('second error')
logging.debug('second debug')
The lack of an argument to getLogger() means that the root logger is modified. This is essentially one step before #del's (good) answer, where you start getting into multiple loggers, each with their own specific properties/output levels/etc.

Rather than modifying the logging levels in your code to control the output, you should consider creating multiple loggers, and setting the logging level for each one individually. For example:
import logging
first_logger = logging.getLogger('first')
second_logger = logging.getLogger('second')
logging.basicConfig()
first_logger.setLevel(logging.ERROR)
second_logger.setLevel(logging.DEBUG)
first_logger.error('first error')
first_logger.debug('first debug')
second_logger.error('second error')
second_logger.debug('second debug')
This outputs:
ERROR:first:first error
ERROR:second:second error
DEBUG:second:second debug

Related

Python logging perforamnce when level is higher

One advantage of using logging in Python instead of print is that you can set the level of the logging. When debugging you could set the level to DEBUG and everything below will get printed. If you set the level to ERROR then only error messages will get printed.
In a high-performance application this property is desirable. You want to be able to print some logging information during development/testing/debugging but not when you run it in production.
I want to ask if logging will be an efficient way to suppress debug and info logging when you set the level to ERROR. In other words, would doing the following:
logging.basicConfig(level=logging.ERROR)
logging.debug('something')
will be as efficient as
if not in_debug:
print('...')
Obviously the second code will not cost because checking a boolean is fast and when not in debug mode the code will be faster because it will not print unnecessary stuff. It comes to the cost of having all those if statement though. If logging delivers same performance without the if statements is of course much more desirable.
There's no need for you to check is_debug in your source code, because the Logging module already does that.
Here's an excerpt, with some comments and whitespace removed:
class Logger(Filterer):
def debug(self, msg, *args, **kwargs):
if self.isEnabledFor(DEBUG):
self._log(DEBUG, msg, args, **kwargs)
Just make sure you follow pylint guidelines on how to pass parameters to the logging function, so they don't execute stuff before calling the logging code. See PyLint message: logging-format-interpolation
I first learned about it through pylint warnings, but here's the official documentation that says to use % formatting and pass arguments to be evaluated later: https://docs.python.org/3/howto/logging.html#optimization

Python context management and verbosity

To facilitate testing code as I write it, I include verbosity in almost every module I write, as follows:
class MyObj(object):
def __init__(arg0, kwarg0="default", verbosity=0):
self.a0 = arg0
self.k0 = kwarg0
self.vb = verbosity
def my_method(self):
if verbosity > 2:
print(f"{self} is doing a thing now...")
or
def my_func(arg0, arg1, verbosity=0):
if verbosity > 2:
print(f"doing something to {arg0} and {arg1}...")
if verbosity > 5: # Added on later edit
import ipdb;ipdb.set_trace() # to clarify requirement
do_somthing()
The executable scripts that import these will have collected (from the command line or elsewhere) a verbosity argument which gets passed all the way down the stack.
It's occurred to me to use a context manager so that I wouldn't have to initialize this variable at every level of the stack, something like having this in the driver script:
with args.verbosity as vb:
my_func("x", "y")
Can I do that and then use vb in my_func without having to include it in the signature? Is there a better way to achieve this kind of control?
SUBSEQUENT EDIT: it's clear from the first answers—-thank you for those--that I need to check out the logging module, but in some cases I want to stop execution in the middle to inspect things at a particular stack level (see the ipdb code I am adding with this edit). Would you still recommend that I use logging? (I'm assuming there's a way to get the logging level if I felt compelled to occasionally litter my code with if statements like that one.)
Finally, I'm still interested in whether the context management solution would be expected to work (even if it's not the optimal solution).
To facilitate testing code as I write it, I include verbosity in almost every module I write ...
Don't litter your code with if-statements and prints for this kind of purpose. It makes the code messy, repetitive and less readable.
The use-case is exactly what stdlib logging is for: you can unconditionally log events which describe what the program is doing, at various verbosity levels, and the messages will be displayed - or not - depending on the logging system's configuration.
import logging
log = logging.getLogger(__name__)
def my_func(arg0, arg1):
log.info("doing something to %s and %s...", arg0, arg1)
do_something()
if __name__ == "__main__":
logging.basicConfig(level=logging.DEBUG, format="%(message)s")
my_func(123, 456)
In the example above, the message will print because it is at level INFO which is above the verbosity level that I've configured logging with (DEBUG). If you configure logging at level WARNING, then it won't display.
Generally the user will control the logging configuration settings (levels, formats, streams, files) via a config file, environment variables, or command-line arguments. It is up to the end-user to choose the specific logging configuration that meets their needs, as the developer you can just log events anytime. No need to worry about where the log events end up going to, or if they end up going anywhere at all.
Another way to do this is levels of logging. For example, Python's builtin logging module has error, warning, info, and debug levels:
import logging
logger = logging.getLogger()
logger.info('Normal message')
logger.debug('Message that only gets printed with high verbosity`)
Simply configure the logging level to debug, warn, etc., and you're basically done! Plus you get lots of native logging goodies.

Setting up advanced Python logging

I'd like to use logging for my modules, but I'm not sure how to design the following requirements:
normal logging levels (info, error, warning, debug) but also some additional more verbose debug levels
logging messages can have different types; some are meant for the developer, some are meant for the user; those types go to different outputs
errors should go to stderr
I also need to keep track which module/function/code line wrote a debug message so that I can activate or deactivate individual debug messages in a configuration
I need to keep track if errors occured at all to eventually execute a sys.exit() at the end of the program
all messages should go to stdout until the loggers are set up
I've read the logging documentation, but I'm not sure what's the most streamline way to use the logging module with the requirements above (how to use concept of Logger, Handler, Filter, ...). Can you point out an idea to set this up? (e.g. write module with two loggers 'user', 'developer'; derive from Logger; do getLogger(__name__); keep error flag like this,... etc.)
1) Adding more verbose debug levels.
Have you thought this through?
Take a look about what the doc says:
Defining your own levels is possible, but should not be necessary, as the existing levels have been chosen on the basis of practical experience. However, if you are convinced that you need custom levels, great care should be exercised when doing this, and it is possibly a very bad idea to define custom levels if you are developing a library. That's because if multiple library authors all define their own custom levels, there is a chance that the logging output from such multiple libraries used together will be difficult for the using developer to control and/or interpret, because a given numeric value might mean different things for different libraries.
Take also a look at When to use logging, there are two very good tables explaining when to use what.
Anyway, if you think you'll need those extra logging levels, take a look at: logging.addLevelName().
2) Some logging messages for the developer, and some for the user
Use different loggers family with different handlers. At the base of each family set Logger.propagate to False.
3) Errors should go to stderr
This already happen by default with StreamHandler:
class logging.StreamHandler(stream=None)
Returns a new instance of the StreamHandler class. If stream is specified, the instance will use it for logging output; otherwise, sys.stderr will be used.
4) Keep track of the source of a log message
Get Loggers with different names, and in your Formatter use format strings with %(name)s.
5) All messages should go to stdout until the loggers are set up
The setup of your logging system should be one of the very first things to do, so I don't really see what this means. If you need to send messages to stdout use print as it should be and already explained in When to use logging.
Last advice: carefully read the Logging Cookbook as it covers pretty well what you need.
From the comment: How would I design to have output to different sources and also filter my module?
I wouldn't filter in the first place, filers are hard to maintain and if they are all in one place that place will have to hold too much information. Every module should get abd set its own Logger (with its own handlers or filters) using or not its parent setting.
Very quick example:
# at the very beginning
root = logging.getLogger()
fallback_handler = logging.StreamHandler(stream=sys.stdout)
root.addHandler(fallback_handler)
# first.py
first_logger = logging.getLogger('first')
first_logger.parent = False
# ... set 'first' logger as you wish
class Foo:
def __init__(self):
self.logger = logging.getLogger('first.Foo')
def baz(self):
self.logger.info("I'm in baz")
# second.py
second_logger = logging.getLogger('first.second') # to use the same settings
# third.py
abstract_logger = logging.getLogger('abs')
abstract_logger.parent = False
# ... set 'abs' logger
third_logger = logging.getLogger('abs.third')
# ... set 'abs.third' particular settings
# fourth.py
fourth_logger = logging.getLogger('abs.fourth')
# [...]

Using print statements only to debug

I have been coding a lot in Python of late. And I have been working with data that I haven't worked with before, using formulae never seen before and dealing with huge files. All this made me write a lot of print statements to verify if it's all going right and identify the points of failure. But, generally, outputting so much information is not a good practice. How do I use the print statements only when I want to debug and let them be skipped when I don't want them to be printed?
The logging module has everything you could want. It may seem excessive at first, but only use the parts you need. I'd recommend using logging.basicConfig to toggle the logging level to stderr and the simple log methods, debug, info, warning, error and critical.
import logging, sys
logging.basicConfig(stream=sys.stderr, level=logging.DEBUG)
logging.debug('A debug message!')
logging.info('We processed %d records', len(processed_records))
A simple way to do this is to call a logging function:
DEBUG = True
def log(s):
if DEBUG:
print s
log("hello world")
Then you can change the value of DEBUG and run your code with or without logging.
The standard logging module has a more elaborate mechanism for this.
Use the logging built-in library module instead of printing.
You create a Logger object (say logger), and then after that, whenever you insert a debug print, you just put:
logger.debug("Some string")
You can use logger.setLevel at the start of the program to set the output level. If you set it to DEBUG, it will print all the debugs. Set it to INFO or higher and immediately all of the debugs will disappear.
You can also use it to log more serious things, at different levels (INFO, WARNING and ERROR).
First off, I will second the nomination of python's logging framework. Be a little careful about how you use it, however. Specifically: let the logging framework expand your variables, don't do it yourself. For instance, instead of:
logging.debug("datastructure: %r" % complex_dict_structure)
make sure you do:
logging.debug("datastructure: %r", complex_dict_structure)
because while they look similar, the first version incurs the repr() cost even if it's disabled. The second version avoid this. Similarly, if you roll your own, I'd suggest something like:
def debug_stdout(sfunc):
print(sfunc())
debug = debug_stdout
called via:
debug(lambda: "datastructure: %r" % complex_dict_structure)
which will, again, avoid the overhead if you disable it by doing:
def debug_noop(*args, **kwargs):
pass
debug = debug_noop
The overhead of computing those strings probably doesn't matter unless they're either 1) expensive to compute or 2) the debug statement is in the middle of, say, an n^3 loop or something. Not that I would know anything about that.
I don't know about others, but I was used to define a "global constant" (DEBUG) and then a global function (debug(msg)) that would print msg only if DEBUG == True.
Then I write my debug statements like:
debug('My value: %d' % value)
...then I pick up unit testing and never did this again! :)
A better way to debug the code is, by using module clrprint
It prints a color full output only when pass parameter debug=True
from clrprint import *
clrprint('ERROR:', information,clr=['r','y'], debug=True)

Python logging before you run logging.basicConfig?

It appears that if you invoke logging.info() BEFORE you run logging.basicConfig, the logging.basicConfig call doesn't have any effect. In fact, no logging occurs.
Where is this behavior documented? I don't really understand.
You can remove the default handlers and reconfigure logging like this:
# if someone tried to log something before basicConfig is called, Python creates a default handler that
# goes to the console and will ignore further basicConfig calls. Remove the handler if there is one.
root = logging.getLogger()
if root.handlers:
for handler in root.handlers:
root.removeHandler(handler)
logging.basicConfig(format='%(asctime)s %(message)s',level=logging.DEBUG)
Yes.
You've asked to log something. Logging must, therefore, fabricate a default configuration. Once logging is configured... well... it's configured.
"With the logger object configured,
the following methods create log
messages:"
Further, you can read about creating handlers to prevent spurious logging. But that's more a hack for bad implementation than a useful technique.
There's a trick to this.
No module can do anything except logging.getlogger() requests at a global level.
Only the if __name__ == "__main__": can do a logging configuration.
If you do logging at a global level in a module, then you may force logging to fabricate it's default configuration.
Don't do logging.info globally in any module. If you absolutely think that you must have logging.info at a global level in a module, then you have to configure logging before doing imports. This leads to unpleasant-looking scripts.
This answer from Carlos A. Ibarra is in principle right, however that implementation might break since you are iterating over a list that might be changed by calling removeHandler(). This is unsafe.
Two alternatives are:
while len(logging.root.handlers) > 0:
logging.root.removeHandler(logging.root.handlers[-1])
logging.basicConfig(format='%(asctime)s %(message)s',level=logging.DEBUG)
or:
logging.root.handlers = []
logging.basicConfig(format='%(asctime)s %(message)s',level=logging.DEBUG)
where the first of these two using the loop is the safest (since any destruction code for the handler can be called explicitly inside the logging framework). Still, this is a hack, since we rely on logging.root.handlers to be a list.
Here's the one piece of the puzzle that the above answers didn't mention... and then it will all make sense: the "root" logger -- which is used if you call, say, logging.info() before logging.basicConfig(level=logging.DEBUG) -- has a default logging level of WARNING.
That's why logging.info() and logging.debug() don't do anything: because you've configured them not to, by... um... not configuring them.
Possibly related (this one bit me): when NOT calling basicConfig, I didn't seem to be getting my debug messages, even though I set my handlers to DEBUG level. After a bit of hair-pulling, I found you have to set the level of the custom logger to be DEBUG as well. If your logger is set to WARNING, then setting a handler to DEBUG (by itself) won't get you any output on logger.info() and logger.debug().
Ran into this same issue today and, as an alternative to the answers above, here's my solution.
import logging
import sys
logging.debug('foo') # IRL, this call is from an imported module
if __name__ == '__main__':
logging.basicConfig(level=logging.INFO, force=True)
logging.info('bar') # without force=True, this is not printed to the console
Here's what the docs say about the force argument.
If this keyword argument is specified as true, any existing handlers
attached to the root logger are removed and closed, before carrying
out the configuration as specified by the other arguments.
A cleaner version of the answer given by #paul-kremer is:
while len(logging.root.handlers):
logging.root.removeHandler(logging.root.handlers[-1])
Note: it is generally safe to assume logging.root.handlers will always be a list (see: https://github.com/python/cpython/blob/cebe9ee988837b292f2c571e194ed11e7cd4abbb/Lib/logging/init.py#L1253)
Here is what I did.
I wanted to log to a file which has a name configured in a config-file and also get the debug-logs of the config-parsing.
TL;DR; This logs into a buffer until everything to configure the logger is available
# Log everything into a MemoryHandler until the real logger is ready.
# The MemoryHandler never flushes (flushLevel 100 is above CRITICAL) automatically but only on close.
# If the configuration was loaded successfully, the real logger is configured and set as target of the MemoryHandler
# before it gets flushed by closing.
# This means, that if the log gets to stdout, it is unfiltered by level
root_logger = logging.getLogger()
root_logger.setLevel(logging.NOTSET)
stdout_logging_handler = logging.StreamHandler(sys.stderr)
tmp_logging_handler = logging.handlers.MemoryHandler(1024 * 1024, 100, stdout_logging_handler)
root_logger.addHandler(tmp_logging_handler)
config: ApplicationConfig = ApplicationConfig.from_filename('config.ini')
# because the records are already logged, unwanted ones need to be removed
filtered_buffer = filter(lambda record: record.levelno >= config.main_config.log_level, tmp_logging_handler.buffer)
tmp_logging_handler.buffer = filtered_buffer
root_logger.removeHandler(tmp_logging_handler)
logging.basicConfig(filename=config.main_config.log_filename, level=config.main_config.log_level, filemode='wt')
logging_handler = root_logger.handlers[0]
tmp_logging_handler.setTarget(logging_handler)
tmp_logging_handler.close()
stdout_logging_handler.close()

Categories