Setting up advanced Python logging

I'd like to use logging for my modules, but I'm not sure how to design the following requirements:

- normal logging levels (info, error, warning, debug), but also some additional, more verbose debug levels
- logging messages can have different types; some are meant for the developer, some are meant for the user; those types go to different outputs
- errors should go to stderr
- I also need to keep track of which module/function/code line wrote a debug message, so that I can activate or deactivate individual debug messages in a configuration
- I need to keep track of whether errors occurred at all, to eventually execute a sys.exit() at the end of the program
- all messages should go to stdout until the loggers are set up

I've read the logging documentation, but I'm not sure what the most streamlined way is to use the logging module with the requirements above (how to use the concepts of Logger, Handler, Filter, ...). Can you point out an idea for setting this up? (e.g. write a module with two loggers 'user' and 'developer'; derive from Logger; do getLogger(__name__); keep an error flag like this, ... etc.)

1) Adding more verbose debug levels.
Have you thought this through?
Take a look about what the doc says:
Defining your own levels is possible, but should not be necessary, as the existing levels have been chosen on the basis of practical experience. However, if you are convinced that you need custom levels, great care should be exercised when doing this, and it is possibly a very bad idea to define custom levels if you are developing a library. That's because if multiple library authors all define their own custom levels, there is a chance that the logging output from such multiple libraries used together will be difficult for the using developer to control and/or interpret, because a given numeric value might mean different things for different libraries.
Also take a look at When to use logging; there are two very good tables explaining when to use what.
Anyway, if you think you'll need those extra logging levels, take a look at: logging.addLevelName().
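For illustration, a minimal sketch of registering one extra, more verbose level; the name TRACE and the numeric value 5 (below DEBUG's 10) are arbitrary choices of mine, not anything the stdlib prescribes:
import logging

TRACE = 5  # an unused numeric slot below DEBUG (10), chosen arbitrarily here
logging.addLevelName(TRACE, "TRACE")

logger = logging.getLogger(__name__)
logger.setLevel(TRACE)
logger.addHandler(logging.StreamHandler())
logger.log(TRACE, "extra-verbose detail")  # reported with the level name "TRACE"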
2) Some logging messages for the developer, and some for the user
Use different logger families with different handlers. On the base logger of each family, set Logger.propagate to False.
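A minimal sketch of what that could look like; the family names 'user' and 'developer' are taken from the question, and the handler choices are just examples:
import logging
import sys

# 'user' family: messages meant for the user go to stdout
user_logger = logging.getLogger('user')
user_logger.propagate = False
user_logger.addHandler(logging.StreamHandler(sys.stdout))

# 'developer' family: messages meant for the developer go elsewhere, e.g. a file
dev_logger = logging.getLogger('developer')
dev_logger.propagate = False
dev_logger.addHandler(logging.FileHandler('developer.log'))

# records bubble up to the family base and stop there
logging.getLogger('user.greeting').warning("visible to the user")  # -> stdout
logging.getLogger('developer.db').warning("connection retried")    # -> developer.log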
3) Errors should go to stderr
This already happens by default with StreamHandler:
class logging.StreamHandler(stream=None)
Returns a new instance of the StreamHandler class. If stream is specified, the instance will use it for logging output; otherwise, sys.stderr will be used.
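So a handler dedicated to errors only needs a level threshold on top of that default; a sketch:
import logging

error_handler = logging.StreamHandler()  # no stream argument -> sys.stderr
error_handler.setLevel(logging.ERROR)    # only ERROR and CRITICAL pass through
logging.getLogger().addHandler(error_handler)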
4) Keep track of the source of a log message
Get Loggers with different names, and in your Formatter use format strings with %(name)s.
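The standard LogRecord attributes also cover the module/function/line part of the question directly; a sketch combining them in a Formatter:
import logging

handler = logging.StreamHandler()
handler.setFormatter(logging.Formatter(
    "%(name)s %(module)s.%(funcName)s:%(lineno)d %(levelname)s %(message)s"))
logging.getLogger().addHandler(handler)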
5) All messages should go to stdout until the loggers are set up
The setup of your logging system should be one of the very first things your program does, so I don't really see what this means. If you need to send messages to stdout before then, use print, as is already explained in When to use logging.
Last advice: carefully read the Logging Cookbook as it covers pretty well what you need.
From the comment: How would I design this to have output going to different sources, and also filter by module?
I wouldn't filter in the first place; filters are hard to maintain, and if they are all in one place, that place will have to hold too much information. Every module should get and set its own Logger (with its own handlers or filters), using or not using its parent's settings.
Very quick example:
import logging
import sys

# at the very beginning
root = logging.getLogger()
fallback_handler = logging.StreamHandler(stream=sys.stdout)
root.addHandler(fallback_handler)

# first.py
first_logger = logging.getLogger('first')
first_logger.propagate = False
# ... set up the 'first' logger as you wish

class Foo:
    def __init__(self):
        self.logger = logging.getLogger('first.Foo')

    def baz(self):
        self.logger.info("I'm in baz")

# second.py
second_logger = logging.getLogger('first.second')  # to use the same settings

# third.py
abstract_logger = logging.getLogger('abs')
abstract_logger.propagate = False
# ... set up the 'abs' logger

third_logger = logging.getLogger('abs.third')
# ... set 'abs.third' particular settings

# fourth.py
fourth_logger = logging.getLogger('abs.fourth')
# [...]

Related

Python context management and verbosity

To facilitate testing code as I write it, I include verbosity in almost every module I write, as follows:
class MyObj(object):
    def __init__(self, arg0, kwarg0="default", verbosity=0):
        self.a0 = arg0
        self.k0 = kwarg0
        self.vb = verbosity

    def my_method(self):
        if self.vb > 2:
            print(f"{self} is doing a thing now...")
or
def my_func(arg0, arg1, verbosity=0):
    if verbosity > 2:
        print(f"doing something to {arg0} and {arg1}...")
    if verbosity > 5:  # Added on later edit
        import ipdb; ipdb.set_trace()  # to clarify requirement
    do_something()
The executable scripts that import these will have collected (from the command line or elsewhere) a verbosity argument which gets passed all the way down the stack.
It's occurred to me to use a context manager so that I wouldn't have to initialize this variable at every level of the stack, something like having this in the driver script:
with args.verbosity as vb:
    my_func("x", "y")
Can I do that and then use vb in my_func without having to include it in the signature? Is there a better way to achieve this kind of control?
SUBSEQUENT EDIT: it's clear from the first answers (thank you for those) that I need to check out the logging module, but in some cases I want to stop execution in the middle to inspect things at a particular stack level (see the ipdb code I am adding with this edit). Would you still recommend that I use logging? (I'm assuming there's a way to get the logging level if I felt compelled to occasionally litter my code with if statements like that one.)
Finally, I'm still interested in whether the context management solution would be expected to work (even if it's not the optimal solution).
To facilitate testing code as I write it, I include verbosity in almost every module I write ...
Don't litter your code with if-statements and prints for this kind of purpose. It makes the code messy, repetitive and less readable.
The use-case is exactly what stdlib logging is for: you can unconditionally log events which describe what the program is doing, at various verbosity levels, and the messages will be displayed - or not - depending on the logging system's configuration.
import logging

log = logging.getLogger(__name__)

def my_func(arg0, arg1):
    log.info("doing something to %s and %s...", arg0, arg1)
    do_something()

if __name__ == "__main__":
    logging.basicConfig(level=logging.DEBUG, format="%(message)s")
    my_func(123, 456)
In the example above, the message will print because it is at level INFO, which is above the level (DEBUG) that I've configured logging with. If you configure logging at level WARNING, then it won't display.
Generally the user will control the logging configuration settings (levels, formats, streams, files) via a config file, environment variables, or command-line arguments. It is up to the end-user to choose the specific logging configuration that meets their needs, as the developer you can just log events anytime. No need to worry about where the log events end up going to, or if they end up going anywhere at all.
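As one example of handing that control to the user, here is a sketch that maps counted -v flags on the command line to a logging level; the counting scheme is just a common convention, not anything the logging module mandates:
import argparse
import logging

parser = argparse.ArgumentParser()
parser.add_argument("-v", "--verbose", action="count", default=0)
args = parser.parse_args()

# no -v -> WARNING, -v -> INFO, -vv (or more) -> DEBUG
level = max(logging.WARNING - 10 * args.verbose, logging.DEBUG)
logging.basicConfig(level=level)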
Another way to do this is with levels of logging. For example, Python's builtin logging module has error, warning, info, and debug levels:
import logging
logger = logging.getLogger()
logger.info('Normal message')
logger.debug('Message that only gets printed with high verbosity')
Simply configure the logging level to debug, warn, etc., and you're basically done! Plus you get lots of native logging goodies.
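Regarding the edit about occasionally stopping in a debugger: yes, you can query whether a given level is currently enabled, via Logger.isEnabledFor(). A sketch mirroring the question's snippet (ipdb is a third-party package, as in the question):
import logging

logger = logging.getLogger(__name__)

def my_func():
    logger.debug("entering my_func")
    if logger.isEnabledFor(logging.DEBUG):
        # only stop to inspect when running at debug verbosity
        import ipdb; ipdb.set_trace()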

Grouping '.format()'-style logging messages in Sentry

I use SentryHandler from raven.handlers.logging to track any logs at higher levels within Sentry. My logging messages are dynamically populated with custom content via .format(), so the text message itself doesn't necessarily always have the same content. For example:
import logging
from raven.handlers.logging import SentryHandler
from raven.conf import setup_logging
# Create a "basic" logger
logger = logging.getLogger("root")
# Create a Sentry logger handler
sh = SentryHandler("https://******#sentry.io/******")
sh.setLevel(logging.WARNING)
setup_logging(sh)
# Send the desired message to Sentry via logger
if SomeInteresetingWarning():
logger.warning("{} missing files in {} directiories!".format(num_files,num_dirs))
All good, except that this causes every unique message to be treated as a separate warning in Sentry, which of course shouldn't be the case.
There is a nice Q&A covering this very problem on GitHub, but the solution provided there only applies to strings formatted with the old-fashioned %s style.
Does anybody know how to implement proper Sentry message grouping (aggregation) without having to redesign the string formatting from format() back to %s placeholders?
Now you can:
https://docs.python.org/3/howto/logging-cookbook.html#use-of-alternative-formatting-styles
The gist is to use a formatter with style="{".
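A sketch of the BraceMessage wrapper from that cookbook section, reusing logger, num_files and num_dirs from the question's snippet. The point is that the {}-style template stays separate from its arguments instead of being pre-merged by .format(); whether Sentry then groups on the constant template is an assumption here, in line with the linked discussion about %s-style messages:
class BraceMessage:
    def __init__(self, fmt, *args, **kwargs):
        self.fmt = fmt
        self.args = args
        self.kwargs = kwargs

    def __str__(self):
        # the template is only merged with its arguments at formatting time
        return self.fmt.format(*self.args, **self.kwargs)

__ = BraceMessage  # the cookbook's conventional short alias

logger.warning(__("{} missing files in {} directories!", num_files, num_dirs))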

how to replace print debug message with the logging module

Up to now, I've been peppering my code with 'print debug message' and even 'if condition: print debug message'. But a number of people have told me that's not the best way to do it, and I really should learn how to use the logging module. After a quick read, it looks as though it does everything I could possibly want, and then some. It looks like a learning project in its own right, and I want to work on other projects now and simply use the minimum functionality to help me. If it makes any difference, I am on Python 2.6 and will be for the foreseeable future, due to library and legacy compatibilities.
All I want to do at the moment is pepper my code with messages that I can turn on and off section by section, as I manage to debug specific regions. As a 'hello_log_world', I tried this, and it doesn't do what I expected
import logging
# logging.basicConfig(level=logging.DEBUG)
logging.error('first error')
logging.debug('first debug')
logging.basicConfig(level=logging.DEBUG)
logging.error('second error')
logging.debug('second debug')
You'll notice I'm using the really basic config, using as many defaults as possible, to keep things simple. But it appears that it's too simple, or that I don't understand the programming model behind logging.
I had expected that sys.stderr would end up with
ERROR:root:first error
ERROR:root:second error
DEBUG:root:second debug
... but only the two error messages appear. Setting level=DEBUG doesn't make the second one appear. If I uncomment the basicConfig call at the start of the program, all four get output.
Am I trying to run it at too simple a level?
What's the simplest thing I can add to what I've written there to get my expected behaviour?
Logging follows a hierarchy of levels (DEBUG -> INFO -> WARNING -> ERROR -> CRITICAL), and the default level is WARNING. The reason you see the two ERROR messages is that ERROR is above WARNING in that hierarchy.
As for the odd commenting behavior, the explanation is found in the logging docs (which as you say are a task unto themselves :) ):
The call to basicConfig() should come before any calls to debug(), info() etc. As it's intended as a one-off simple configuration facility, only the first call will actually do anything: subsequent calls are effectively no-ops.
However, you can use the setLevel() method to get what you desire:
import logging
logging.getLogger().setLevel(logging.ERROR)
logging.error('first error')
logging.debug('first debug')
logging.getLogger().setLevel(logging.DEBUG)
logging.error('second error')
logging.debug('second debug')
The lack of an argument to getLogger() means that the root logger is modified. This is essentially one step before @del's (good) answer, where you start getting into multiple loggers, each with their own specific properties/output levels/etc.
Rather than modifying the logging levels in your code to control the output, you should consider creating multiple loggers, and setting the logging level for each one individually. For example:
import logging
first_logger = logging.getLogger('first')
second_logger = logging.getLogger('second')
logging.basicConfig()
first_logger.setLevel(logging.ERROR)
second_logger.setLevel(logging.DEBUG)
first_logger.error('first error')
first_logger.debug('first debug')
second_logger.error('second error')
second_logger.debug('second debug')
This outputs:
ERROR:first:first error
ERROR:second:second error
DEBUG:second:second debug

Redefining logging root logger

In my current project there are thousands of lines of code which look like this:
logging.info("bla-bla-bla")
I don't want to change all these lines, but I would like to change the logging behavior. My idea is to change the root logger to another, 'Experimental' logger, which is configured by an ini-file:
[loggers]
keys = root, Experimental

[handlers]
keys = logfile

[formatters]
keys = detailed

[formatter_detailed]
format = %(asctime)s:%(name)s:%(levelname)s %(module)s:%(lineno)d: %(message)s

[handler_logfile]
class = FileHandler
args = ('experimental.log', 'a')
formatter = detailed

[logger_root]
level = WARNING
handlers =

[logger_Experimental]
level = DEBUG
qualname = Experimental
handlers = logfile
propagate = 0
Now setting the new root logger is done by this piece of code:
logging.config.fileConfig(path_to_logger_config)
logging.root = logging.getLogger('Experimental')
Is redefining the root logger safe? Maybe there is a more convenient way?
I've tried to use google and looked through stackoverflow questions, but I didn't find the answer.
You're advised not to redefine the root logger in the way you describe. In general you should only use the root logger directly for small scripts - for larger applications, best practice is to use
logger = logging.getLogger(__name__)
in each module where you use logging, and then make calls to logger.info() etc.
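For example, a minimal sketch of that per-module pattern (the module and function names are illustrative):
# mymodule.py (hypothetical module)
import logging

logger = logging.getLogger(__name__)  # named after the module, e.g. 'mymodule'

def do_work():
    logger.info("doing work")  # handled by whatever the application configured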
If all you want to do is to log to a file, why not just add a file handler to the root logger? You can do this using e.g.
if __name__ == '__main__':
    logging.basicConfig(filename='experimental.log', filemode='w')
    main()  # or whatever your main entry point is called
or via a configuration file.
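For the configuration-file route, a minimal dictConfig() sketch of the same file-handler idea (the filename is taken from the question; the rest is a plain schema):
import logging.config

logging.config.dictConfig({
    "version": 1,
    "handlers": {
        "file": {
            "class": "logging.FileHandler",
            "filename": "experimental.log",
            "mode": "w",
        },
    },
    "root": {"level": "DEBUG", "handlers": ["file"]},
})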
Update: When I say "you're advised", I mean by me, here in this answer ;-) While you may not run into any problems in your scenario, it's not good practice to overwrite a module attribute which hasn't been designed to be overwritten. For example, the root logger is an instance of a different class (which is not part of the public API), and there are other references to it in the logging machinery which would still point to the old value. Either of these facts could lead to hard-to-debug problems. Since the logging package allows a number of ways of achieving what you seem to want (seemingly, logging to a file rather than the console), then you should use those mechanisms that have been provided.
logger = logging.getLogger()
Leaving the name empty returns the root logger.
logger = logging.getLogger('name')
Gives you another logger.

Python logging before you run logging.basicConfig?

It appears that if you invoke logging.info() BEFORE you run logging.basicConfig, the logging.basicConfig call doesn't have any effect. In fact, no logging occurs.
Where is this behavior documented? I don't really understand.
You can remove the default handlers and reconfigure logging like this:
import logging

# if someone tried to log something before basicConfig is called, Python creates a default handler that
# goes to the console and will ignore further basicConfig calls. Remove the handler if there is one.
root = logging.getLogger()
if root.handlers:
    for handler in root.handlers:
        root.removeHandler(handler)
logging.basicConfig(format='%(asctime)s %(message)s', level=logging.DEBUG)
Yes.
You've asked to log something. Logging must, therefore, fabricate a default configuration. Once logging is configured... well... it's configured.
"With the logger object configured,
the following methods create log
messages:"
Further, you can read about creating handlers to prevent spurious logging. But that's more a hack for bad implementation than a useful technique.
There's a trick to this.
No module can do anything except logging.getLogger() requests at a global level.
Only the if __name__ == "__main__": portion should do the logging configuration.
If you do logging at a global level in a module, then you may force logging to fabricate its default configuration.
Don't do logging.info globally in any module. If you absolutely think that you must have logging.info at a global level in a module, then you have to configure logging before doing imports. This leads to unpleasant-looking scripts.
This answer from Carlos A. Ibarra is in principle right; however, that implementation might break, since you are iterating over a list that is mutated by calling removeHandler(). This is unsafe.
Two alternatives are:
while len(logging.root.handlers) > 0:
    logging.root.removeHandler(logging.root.handlers[-1])
logging.basicConfig(format='%(asctime)s %(message)s', level=logging.DEBUG)
or:
logging.root.handlers = []
logging.basicConfig(format='%(asctime)s %(message)s',level=logging.DEBUG)
where the first of these two (using the loop) is the safer, since any destruction code for a handler can be called explicitly inside the logging framework. Still, this is a hack, since we rely on logging.root.handlers being a list.
Here's the one piece of the puzzle that the above answers didn't mention... and then it will all make sense: the "root" logger -- which is used if you call, say, logging.info() before logging.basicConfig(level=logging.DEBUG) -- has a default logging level of WARNING.
That's why logging.info() and logging.debug() don't do anything: because you've configured them not to, by... um... not configuring them.
Possibly related (this one bit me): when NOT calling basicConfig, I didn't seem to be getting my debug messages, even though I set my handlers to DEBUG level. After a bit of hair-pulling, I found you have to set the level of the custom logger to be DEBUG as well. If your logger is set to WARNING, then setting a handler to DEBUG (by itself) won't get you any output on logger.info() and logger.debug().
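A minimal sketch of that pitfall; the logger name 'myapp' is arbitrary:
import logging

logger = logging.getLogger("myapp")

handler = logging.StreamHandler()
handler.setLevel(logging.DEBUG)    # lowering only the handler is not enough...
logger.addHandler(handler)

logger.debug("invisible")          # logger's effective level is still WARNING

logger.setLevel(logging.DEBUG)     # ...the logger itself must be lowered too
logger.debug("now this appears")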
Ran into this same issue today and, as an alternative to the answers above, here's my solution.
import logging
import sys

logging.debug('foo')  # IRL, this call is from an imported module

if __name__ == '__main__':
    logging.basicConfig(level=logging.INFO, force=True)
    logging.info('bar')  # without force=True, this is not printed to the console
Here's what the docs say about the force argument (which was added in Python 3.8):
If this keyword argument is specified as true, any existing handlers attached to the root logger are removed and closed, before carrying out the configuration as specified by the other arguments.
A cleaner version of the answer given by @paul-kremer is:
while len(logging.root.handlers):
    logging.root.removeHandler(logging.root.handlers[-1])
Note: it is generally safe to assume logging.root.handlers will always be a list (see: https://github.com/python/cpython/blob/cebe9ee988837b292f2c571e194ed11e7cd4abbb/Lib/logging/__init__.py#L1253)
Here is what I did.
I wanted to log to a file whose name is configured in a config file, and also capture the debug logs of the config parsing itself.
TL;DR: This logs into a buffer until everything needed to configure the logger is available.
import logging
import logging.handlers
import sys

# Log everything into a MemoryHandler until the real logger is ready.
# The MemoryHandler never flushes automatically (flushLevel 100 is above CRITICAL), only on close.
# If the configuration was loaded successfully, the real logger is configured and set as target of
# the MemoryHandler before it gets flushed by closing.
# This means that if the log goes to stdout, it is unfiltered by level.
root_logger = logging.getLogger()
root_logger.setLevel(logging.NOTSET)
stdout_logging_handler = logging.StreamHandler(sys.stdout)
tmp_logging_handler = logging.handlers.MemoryHandler(1024 * 1024, 100, stdout_logging_handler)
root_logger.addHandler(tmp_logging_handler)

config: ApplicationConfig = ApplicationConfig.from_filename('config.ini')

# Because the records are already logged, unwanted ones need to be removed.
# (The buffer must stay a list, so don't assign a lazy filter object here.)
tmp_logging_handler.buffer = [record for record in tmp_logging_handler.buffer
                              if record.levelno >= config.main_config.log_level]

root_logger.removeHandler(tmp_logging_handler)
logging.basicConfig(filename=config.main_config.log_filename, level=config.main_config.log_level, filemode='wt')

logging_handler = root_logger.handlers[0]
tmp_logging_handler.setTarget(logging_handler)
tmp_logging_handler.close()
stdout_logging_handler.close()
