proper logging implementation for Python modules

Does anyone know of any good examples of modules with nice logging implementations?
I have been doing logging a couple of different ways, but I'm not sure which is most Pythonic.
For scripts, this is what I've been doing:
import logging
LOGGER = logging.getLogger(__program__)
STREAM = logging.StreamHandler()
FORMATTER = logging.Formatter('(%(name)s) %(levelname)s: %(message)s')
STREAM.setFormatter(FORMATTER)
LOGGER.addHandler(STREAM)
def main():
    LOGGER.warning('This is a warning message.')
That is executed in the global namespace and I can call LOGGER from anywhere.
The aforementioned solution is not such a good idea for modules because the code is executed upon import. So for modules, I have been calling this _logging() function early on to set up the logging.
def _logging():
    import logging
    global logger
    logger = logging.getLogger(__program__)
    stream = logging.StreamHandler()
    formatter = logging.Formatter('(%(name)s) %(levelname)s: %(message)s')
    stream.setFormatter(formatter)
    logger.addHandler(stream)
def main():
    _logging()
    logger.warning('This is a warning message.')
Since logger is global, I can call it anywhere it is needed. However, pylint reports a global-variable-undefined warning, which it describes as: "Global variable logger undefined at the module level. Used when a variable is defined through the 'global' statement but the variable is not defined in the module scope." I'm not really sure why that is an issue here.
Or should I call the _logging() function early on (minus the global) and then create the logger everywhere that it is needed?
import logging  # imported at module level so main() can call logging.getLogger() too
def _logging():
    logger = logging.getLogger(__program__)
    stream = logging.StreamHandler()
    formatter = logging.Formatter('(%(name)s) %(levelname)s: %(message)s')
    stream.setFormatter(formatter)
    logger.addHandler(stream)
def main():
    _logging()
    logger = logging.getLogger(__program__)
    logger.warning('This is a warning message.')
The last technique seems to be the cleanest, albeit the most tedious, especially since I am often logging from within dozens of small classes, functions, methods, et cetera. Are there any examples from people/modules who have already blazed a path through this territory?

If I understood correctly, you are configuring logging in each module separately. That is unnecessary and goes against the design of the logging module.
I think the key to logging is understanding that the logging module holds state that is shared across your whole Python process. At least for me, after that insight there was only one obvious way to do logging in most situations.
You should configure logging at the beginning of your program. Define handlers, formatters, etc., and the configuration will remain throughout the program, as long as it isn't explicitly overridden.
All modules that do logging can define a global logger right after logging is imported. There is no need to put this into a function. As is also recommended by the documentation, it is good practice to name each logger according to module name (including package path):
import logging
logger = logging.getLogger(__name__)
It is also important to understand that loggers in your program form a hierarchy. By default, loggers propagate records to their parents. This means that at the bottom there is one logger (root) which gets all the records unless you configure some loggers to prevent this. Often it may be enough to configure only the root logger.
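A minimal sketch of that propagation behaviour. The logger name "app.db" and the ListHandler helper are made up for illustration and are not part of the original answer:

```python
import logging


class ListHandler(logging.Handler):
    """Hypothetical helper: collects the records it receives in a list."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(f"{record.name}: {record.getMessage()}")


def demo_propagation():
    root = logging.getLogger()
    child = logging.getLogger("app.db")  # hypothetical module logger, no handlers
    handler = ListHandler()
    root.addHandler(handler)
    old_level = child.level
    child.setLevel(logging.INFO)
    try:
        child.info("propagated")      # travels up to the root logger's handler
        child.propagate = False
        child.info("not propagated")  # stops at the child, which has no handler
    finally:
        # Undo the changes so the demo leaves the logging state as it found it.
        child.propagate = True
        child.setLevel(old_level)
        root.removeHandler(handler)
    return handler.messages
```

Only the first message reaches the handler attached to the root logger, which is why configuring the root logger alone is often enough.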
To be a little more concrete, let's make a program with two modules, one.py and two.py. one.py contains a function main which will be the entry point to the program. We'll configure logging using dictConfig, which lets us separate the logging configuration nicely from the rest of the code. We'll put the configuration dict in a separate YAML file, like this:
# logging_config.yaml
version: 1
formatters:
  brief:
    format: '%(message)s'
  default:
    format: '%(asctime)s %(levelname)-8s %(name)-15s %(message)s'
    datefmt: '%Y-%m-%d %H:%M:%S'
handlers:
  console:
    class: logging.StreamHandler
    formatter: brief
    stream: ext://sys.stdout
  file:
    class: logging.handlers.RotatingFileHandler
    formatter: default
    filename: example.log
    maxBytes: 1024
    backupCount: 3
loggers:
  two:
    level: INFO
    handlers: [file]
    propagate: False
root:
  level: INFO
  handlers: [console]
These snippets are mostly adapted from the documentation. In this configuration, every record at level INFO or above from logger two is written to a file, and records from two are not propagated further. Every record at level INFO or above that reaches the root logger is written to the console.
The definition of module one could then be like this:
# one.py
import logging
import logging.config
import yaml
def configure_logging(filename):
    with open(filename) as f:
        # safe_load parses plain YAML; yaml.load() without a Loader fails on recent PyYAML
        config = yaml.safe_load(f)
    logging.config.dictConfig(config)
def main():
    configure_logging('logging_config.yaml')
    from two import func
    logger = logging.getLogger(__name__)
    logger.info('Starting the program')
    func()
    logger.info('Finished')
One tricky detail here is that we import module two and define the logger only after the configuration is set. This is done because by default dictConfig disables all existing loggers.
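If deferring the import is inconvenient, dictConfig also accepts a disable_existing_loggers key. The sketch below is not part of the original answer; it uses a small inline dict instead of the YAML file, and the logger name early_module is hypothetical:

```python
import logging
import logging.config

# Created before configuration, as a module-level logger would be on import.
early_logger = logging.getLogger("early_module")  # hypothetical name

CONFIG = {
    "version": 1,
    "disable_existing_loggers": False,  # keep loggers created before this call
    "handlers": {
        "console": {
            "class": "logging.StreamHandler",
            "stream": "ext://sys.stdout",
        },
    },
    "root": {"level": "INFO", "handlers": ["console"]},
}

logging.config.dictConfig(CONFIG)
early_logger.info("still enabled")  # handled by the root logger's console handler
```

With this key set to false, module-level loggers created at import time keep working, so the import order matters less.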
And finally, here is the definition for module two:
# two.py
import logging
logger = logging.getLogger(__name__)
def func():
logger.info('Doing stuff')
Now if we run the program, we get the following output:
>>> import one
>>> one.main()
Starting the program
Finished
And the log file example.log contains the following line:
2015-06-07 15:04:15 INFO     two             Doing stuff
Excellent examples of logging can be found in Python documentation:
Logging HOWTO
Logging Cookbook
Logging API reference

Related

How to prevent imported packages/modules from adding logging handlers

Context: I would like to create a single stream to stdout of log events formatted in JSON without any duplicate events. Within my own applications, this is relatively simple. I simply configure the root logger to have a single StreamHandler and use python-json-logger as the Formatter.
import logging
from pythonjsonlogger import jsonlogger
included_attributes = '%(asctime)s %(filename)s %(funcName)s %(lineno)d %(message)s %(levelname)s %(module)s %(name)s %(pathname)s %(process)d %(thread)s %(threadName)s'
json_formatter = jsonlogger.JsonFormatter(included_attributes)
def change_root_logger_format():
    root_logger = logging.getLogger()
    root_logger.setLevel(logging.DEBUG)
    root_handler = logging.StreamHandler()
    root_handler.setFormatter(json_formatter)
    root_logger.handlers = [root_handler]
For various modules I add the standard logger = logging.getLogger(__name__).
All logs roll up the hierarchy to root with my single StreamHandler() and output to stdout as JSON.
As I understand it, best practice for distributed packages is that they don't declare their own handlers (or at least check if they are __main__ before doing so). I have found that this is not universally applied, however.
I am using at least one package that adds its own handlers. That is, the package __init__.py contains something like
logger = logging.getLogger(__name__)
logger.addHandler(StreamHandler())
As a result, I end up with duplicate logs for everything logged by this package. One line from its own handler, and one from the root logger.
The only solution I've found to this is to remove the handlers after the fact. That is, loop through logging.root.manager.loggerDict and remove the handlers of any Logger that isn't root. The problem with this approach is that any new import could cause new Handlers to appear. Thus, all of my modules would need the same boilerplate handler removal after their imports.
Does anyone know of a way to prevent this problem pre-emptively and exhaustively? That is, preventing all loggers but root from having Handlers?
I don't believe this question is a duplicate. There are many questions on disabling package/module loggers (for example, this one). These questions and answers don't apply to removing handlers while keeping the logger active.

You can disable logging from imported modules using the logging module. You can configure it to not log messages unless they are at least warnings using the following code:
import logging
logging.getLogger("imported_module").setLevel(logging.WARNING)
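For reference, the after-the-fact clean-up described in the question could look roughly like this. It is only a sketch: the function name is hypothetical, and as the question notes, it has to be run again after any later import that adds handlers, so it is not a pre-emptive fix.

```python
import logging


def strip_non_root_handlers():
    """Remove handlers from every named logger; the root logger is untouched."""
    removed = 0
    for logger in list(logging.root.manager.loggerDict.values()):
        # loggerDict also contains PlaceHolder objects, which carry no handlers.
        if isinstance(logger, logging.Logger):
            for handler in list(logger.handlers):
                logger.removeHandler(handler)
                removed += 1
    return removed
```

The loggers themselves stay active, so their records still propagate to the root logger's single handler.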

Python - Will logging happen if I don't import logging module

When debugging a python function, I set up my logging like this:
import logging
logger_name = 'test.log'
log_format = "%(funcName)s:%(lineno)d -%(message)s"
logging.basicConfig(filename=logger_name,
                    level=logging.DEBUG,
                    format=log_format,
                    filemode='w')
My program consists of multiple functions that are located in separate files, so I get the logger object at each file where I take functions from using the following:
import logging
logger = logging.getLogger(__name__)
However, when executing the function for finding out its execution time, I do not want to include my logs. Is it enough to just remove the set up of logging in the file where the main function is located? Or should I also remove logger = logging.getLogger(__name__) from each file, or should I do something else?

Besides writing useful information to the console or a file, the main idea of a logger is that you can easily disable certain levels of logging.
To disable only the debug logging, change level=logging.DEBUG to level=logging.INFO or level=logging.WARNING.
That way you keep your log at warning level in production, and if you want debug output again, you change it back.
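As a rough sketch of that switch, assuming the loggers in the other files keep their default NOTSET level and so inherit the root logger's level (the logger name and the set_debug helper are hypothetical):

```python
import logging

logger = logging.getLogger("timing_demo")  # hypothetical module logger


def set_debug(enabled):
    """Hypothetical helper: switch the root level between DEBUG and WARNING."""
    logging.getLogger().setLevel(logging.DEBUG if enabled else logging.WARNING)
    # Loggers left at NOTSET inherit the root level.
    return logger.isEnabledFor(logging.DEBUG)
```

The logger = logging.getLogger(__name__) lines can stay in every file; a debug call that is below the active level returns after a cheap level check.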

How do you use logging in package which by default falls back to print in Python?

I want to change my print statements of my package to using logging. So I will write my scripts like
import logging
logger = logging.getLogger(__name__)
def func():
    logger.info("Calling func")
which is the recommended way?
However, many users do not initialize logging and hence will not see the output.
Is there a recommended way so that users who do not initialize logging can see info output, and those who explicitly set up logging do not get their specific configuration tampered with by my package?

As a general rule of thumb, modules should never configure logging directly (or make other unsolicited changes to the shared STDOUT/STDERR), as that's the realm of the module user. If users want the output, they should explicitly enable logging and then, and only then, your module should be allowed to log.
Bonus points if you provide an interface for explicitly enabling logging within your module so that the user doesn't have to explicitly change levels / disable loggers of your inner components if they're interested only in logging from their own code.
Of course, to keep using logging when a STDOUT/STDERR handler is not yet initialized you can use logging.NullHandler so all you have to do in your module is:
import logging
logger = logging.getLogger(__name__)
logger.addHandler(logging.NullHandler())  # initialize your logger with a NullHandler
def func():
    logger.info("Calling func")
func()  # (((crickets)))
Now you can continue using your logger throughout your module and until the user initializes logging your module won't trespass on the output, e.g.:
import your_module
your_module.func() # (((crickets)))
import logging
logging.root.setLevel(logging.INFO)
logging.info("Initialized!") # INFO:root:Initialized!
your_module.func() # INFO:your_module:Calling func
Of course, the actual logging format, level and other settings should also be in the realm of the user so if they change the format of the root log handler, it should be inherited by your module logger as well.
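The opt-in interface mentioned above might look something like the following. This is only a sketch: enable_logging is a hypothetical name, and "your_module" stands in for __name__ inside the package.

```python
import logging
import sys

logger = logging.getLogger("your_module")  # stands in for __name__ in the package
logger.addHandler(logging.NullHandler())


def enable_logging(level=logging.INFO, stream=None):
    """Hypothetical opt-in helper: attach a stream handler to this package only."""
    handler = logging.StreamHandler(stream or sys.stderr)
    handler.setFormatter(logging.Formatter("%(levelname)s:%(name)s:%(message)s"))
    logger.addHandler(handler)
    logger.setLevel(level)
    return handler  # returned so the caller can remove it again
```

Nothing happens until the user calls your_module.enable_logging(), so the package never touches the output on its own. Records still propagate to the root logger, so a user who also configures root handlers would see each message twice unless one of the two is removed.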

Logging level randomly changing and not always writing to file

I have this piece of code to set my logger in Python:
#Configure logging
logging.basicConfig(format='%(asctime)s %(name)-12s %(levelname)-8s %(message)s',
                    datefmt='%m-%d %H:%M',
                    filename="log.txt",
                    level=logging.getLevelName('DEBUG'))
print(logging.getLogger().getEffectiveLevel())
But the output from the print statement sometimes is this:
30
And other times is this (which is correct):
10
But often, even when the logging level is set to the correct number, nothing is logged to the file, while other times it works. What do I need to do to make sure my logging level is set correctly?
*Edit: Below is my solution based on the recommendation of @randomir.
**Edit: I had to make a second change where I set the level after I call logging.basicConfig(), or else the logging level still was not being set consistently. The line logging.getLogger().setLevel(...) now seems to work.
I created a new class: Logger.py.
import logging
class Logger(object):
    def __init__(self):
        logging.basicConfig(format='%(asctime)s %(name)-12s %(levelname)-8s %(message)s',
                            datefmt='%m-%d %H:%M',
                            filename="log.txt")
        logging.getLogger().setLevel(logging.getLevelName("DEBUG"))
        print(logging.getLogger().getEffectiveLevel())
And now instead of configuring the basic config directly in my startup class I just instantiate the logger class once:
from Logger import Logger
import logging
class LaunchPython(object):
    #Configure Logging
    Logger()
    logging.info("Application has started")
On all subsequent classes that call the logger I just put import logging on the top and then do logging.info(...) or logging.debug(....). There is no need to import the Logger.py class and reinstantiate it.

logging.basicConfig() configures the root logger of your application (the logger at the top of the hierarchy, which you get with logging.getLogger(); since Python 3.9, logging.getLogger('root') returns the same object).
The trick is that the root logger starts out with defaults (like level=30, i.e. WARNING), and the first call to a module-level logging function (like logging.info()) implicitly calls basicConfig() with no arguments if the root logger has no handlers yet. Once the root logger has a handler, later basicConfig() calls do nothing. So, make sure you call your basicConfig() before any logging in any other part of the application.
You can do that by extracting your logger config to a separate module, like logger.py, and then importing that module in each of your modules. Or, if your application has a central entry point, just do the configuration there. Note that third-party code you call can trigger the same implicit configuration if it logs through the module-level functions first.
Also note, if your application is multi-threaded, the documentation for basicConfig() says:
Note
This function should be called from the main thread before other threads are started. In versions of Python prior to 2.7.1 and 3.2, if this function is called from multiple threads, it is possible (in rare circumstances) that a handler will be added to the root logger more than once, leading to unexpected results such as messages being duplicated in the log.
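The ordering problem can be reproduced in a few lines. The sketch below is illustrative only: the function name is hypothetical, it temporarily resets the root logger so that it starts unconfigured, and the force=True argument it uses requires Python 3.8 or later.

```python
import logging


def effective_level_after_early_call():
    root = logging.getLogger()
    saved_handlers, saved_level = root.handlers[:], root.level
    root.handlers = []              # start from an unconfigured root logger
    root.setLevel(logging.WARNING)
    try:
        logging.info("early call")  # implicitly runs basicConfig() with defaults
        logging.basicConfig(level=logging.DEBUG)  # ignored: root already has a handler
        ignored = root.getEffectiveLevel()
        logging.basicConfig(level=logging.DEBUG, force=True)  # replaces the handlers
        forced = root.getEffectiveLevel()
    finally:
        # Remove the demo handler and restore the previous root configuration.
        for handler in root.handlers[:]:
            root.removeHandler(handler)
            handler.close()
        root.handlers = saved_handlers
        root.setLevel(saved_level)
    return ignored, forced
```

The first value stays at WARNING (30) because the early logging.info() call already attached a handler; only the forced call switches the level to DEBUG (10). This matches the intermittent 30 versus 10 seen in the question.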

Should i use logging module or logging class?

I am writing a big program with many modules, and I want to use logging in each of them. What is the best practice for logging in Python?
Should I import the standard logging module and use it in every file:
#proxy_control.py
#!/usr/bin/env python3
import logging
class ProxyClass():
    def check_proxy(self):
        pass
logging.basicConfig(filename='proxy.log', level=logging.INFO)
logging.info('Proxy work fine')
Or should I write one MyLog() class that wraps the default logging module and use it from all my other modules?
#proxy_control.py
#!/usr/bin/env python3
from my_log import MyLog
class ProxyClass():
    def check_proxy(self):
        pass
MyLog.send_log('Proxy work fine')
#my_log.py
#!/usr/bin/env python3
import logging
class MyLog():
    @staticmethod
    def send_log(value):
        pass

A typical convention is to define a logger at the top of every module that requires logging and then use that logger object throughout the module.
logger = logging.getLogger(__name__)
This way, loggers will follow your package names (i.e. package.subpackage.module). This is useful in the logging module because loggers propagate messages upwards based on the logger name (i.e. parent.child will pass messages up to parent). This means that you can do all your configuration on the top-level logger and it will get messages from all the sub-loggers in your package. Or, if someone else is using your library, it will be very easy for them to configure which logging messages they get from your library because it will have a consistent namespace.
For a library, you typically don't want to show logging messages unless the user explicitly enables them. When no handler is configured, Python 3 still prints messages of level WARNING and above to stderr through its handler of last resort (Python 2 prints a one-time "No handlers could be found" warning instead), so you should add a NullHandler to your top-level logger. This will prevent the messages from automatically being printed to stderr.
logger = logging.getLogger(__name__)
logger.addHandler(logging.NullHandler())
NOTE - The NullHandler was added in Python 2.7 (and 3.1); for earlier versions, you'll have to implement it yourself.
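To illustrate the consistent-namespace point, here is a small sketch. The names package and package.subpackage.module are placeholders for what __name__ would produce in a real package:

```python
import logging

# Library side: loggers named after the package path (placeholder names).
pkg_logger = logging.getLogger("package")
pkg_logger.addHandler(logging.NullHandler())
mod_logger = logging.getLogger("package.subpackage.module")

# Application side: one call on the top-level name affects every sub-logger
# that has not set a level of its own.
logging.getLogger("package").setLevel(logging.ERROR)
```

After this, mod_logger reports ERROR as its effective level even though nothing was set on it directly, which is what makes a library easy to quiet or enable from the outside.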

Use the logging module, and leave logging configuration to your application's entry point (modules should not configure logging by themselves).

We Keep Coding
