How to use a single Python logging object throughout the Python application?

Python provides the logging module. We can use a logger in place of print and take advantage of its multiple log levels. The issue is that when we log, we pass the message string to a logger object, which means the logger object must be accessible from every function, method, and class in the entire Python program.
logger = logging.getLogger('mylogger')
logger.info('This is a message from mylogger.')
Now my question is: for large Python programs that are possibly split across more than one source file and made up of a multitude of functions, methods, and classes, how do we ensure that the same logger object is used everywhere to log messages? Or do I have the wrong idea of how the logging module is used?

Or do I have the wrong idea on how the logging module is used?
Yup, wrong idea.
for large Python programs that are ... split across more than 1 source file ... how do we ensure that the same logger object is used everywhere?
That's not a goal.
We log messages so a maintenance engineer can pick up the pieces later.
Use one logger per module.
It is a feature that a logger will reveal the source module it came from.
We use that to rapidly narrow down "what code ran?" & "with what input values?"
This assists a maintainer in reproducing and repairing observed buggy output.
The usual idiom is to begin each module with
logger = logging.getLogger(__name__)
That way, logger.error(...) and similar calls will reveal the name of the module reporting the error, which lets maintainers rapidly focus on the code leading up to the error.
We do want all loggers to follow the same output format, so that they produce compatible messages that can be uniformly parsed by e.g. some awk script. Typically an "if main" clause invokes basicConfig just once, which takes care of that.
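For example, a minimal entry point might look like this (a sketch; the format string is just an illustration, not a requirement):
# main.py -- configure logging once, at the application entry point
import logging

logger = logging.getLogger(__name__)

if __name__ == '__main__':
    logging.basicConfig(
        level=logging.INFO,
        format='%(asctime)s %(name)s %(levelname)s %(message)s',
    )
    logger.info('application starting')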

I can tell you what I do! I create a module in my application called logs that is imported by the entrypoint, creates and configures a root logger, and exposes a get_logger function that gets a child of that root logger with a given name. Consider:
# logs.py
import logging

ROOT = logging.getLogger("root")
# configure that logger in some way

def get_logger(name: str):
    # note: Logger objects have getChild, not getLogger
    return ROOT.getChild(name)
Then in other modules, I do:
# lib.py
import logs
logger = logs.get_logger(__name__)
# the body of my module, calling logger.info etc. as needed
The important part to remember is that logs.py must not depend on anything else in the package; otherwise you have to be very careful about the order in which you import things, lest you run into a circular-import problem!

Related

python logging from multiple packages?

I am importing two modules I built into a Python script. I want logging from the script plus both modules to go into a single log file. The logging cookbook and related forum posts (example) show how to do it for one script and one imported module, with the limitation that they both use the same logger name. The module's logger is named after the module (using __name__), so the script can find and re-use it using the top-level package name.
So one script and one imported module can share a logger. But how do I attach the second imported module to that logger? And what if I want the script's log name to be distinct from the imported module's name? Here is the solution I came up with. It works, but it's ugly. I need three logger objects! Is there a better way?
# Code for module maps.publish.upload
import logging
logger = logging.getLogger(__name__)
class ServicePublisher(object):
    def __init__(self):
        logger.debug(f'start constructor {__class__.__name__}')
Class charts.staffing.report.StaffingReport is identical to ServicePublisher
# log_test.py script
import logging
from pathlib import Path
from charts.staffing.report import StaffingReport
from maps.publish.upload import ServicePublisher
filename = Path(__file__).parts[-1]
logger = logging.getLogger(filename)
logger.setLevel(logging.DEBUG)
fh = logging.FileHandler('log_test.log')
fh.setLevel(logging.DEBUG)
formatter = logging.Formatter('%(name)s - %(levelname)s - %(message)s')
fh.setFormatter(formatter)
logger.addHandler(fh)
logger_maps = logging.getLogger('maps')
logger_maps.setLevel(logging.DEBUG)
logger_maps.addHandler(fh)
logger_charts = logging.getLogger('charts')
logger_charts.setLevel(logging.DEBUG)
logger_charts.addHandler(fh)
logger.debug(f'start {filename}')
publisher = ServicePublisher()
report = StaffingReport()
Here is the log output, exactly as expected:
log_test.py - DEBUG - start log_test.py
maps.publish.upload - DEBUG - start constructor ServicePublisher
charts.staffing.report - DEBUG - start constructor StaffingReport
Is there a way to use my logger object for all three cases, so I don't need logger_maps and logger_charts, while still maintaining distinct names in the log output?
I worked on this a bit more and didn't really come up with a better solution, but I'll share what I learned below.
1. I tried going to the root manager's logger dict and creating a logger for everything there. You don't need to create loggers for sub-modules in your package structure. E.g.: charts.staffing.report will have a parent logger charts.staffing, which has a parent charts, which is all you need, so I removed anything with a . in it.
logger_names = [name for name in logging.root.manager.loggerDict
                if '.' not in name]
Then, I used logging.getLogger(<name>) on each name in the list and attached my handlers to it. This was so simple that I was in love with this solution, until I tested it some more. I was using debug level logging, and this turned on logging from every imported package in some APIs I use. Those APIs also import packages with loggers, so this got really spammy. By default, I think those loggers all bubble up to the root logger at the warning level, which seems more appropriate. So my first lesson is to be intentional about which loggers you tinker with, rather than just going through all of them.
2. As I mentioned above, Python bubbles logging up from sub-modules to parent modules' loggers. If I had it to do all over again, I would name all my packages starting with an org name prefix. For example, with packages charts and maps, if I work for the Acme Corporation, then I would name these acme.charts and acme.maps so I can just create a single logger acme that handles all my packages. That's harder once you have lots of code using your package, so it's something to consider when you start.
Note that the convention is to use the package name for your loggers, but the name is just a string and you can name them whatever you want. I could do something like this to add an artificial prefix that unifies all of them:
logger = logging.getLogger(f"acme.{__name__}")
Even though the user still imports my example packages as charts and maps, the loggers would be named acme.charts and acme.maps, which you could get with a single logger named acme. It's a good technical trick, but I would have to document and communicate that to my users, since it's not intuitive that a package charts would use a logger named acme.
3. I'm still creating a logger for each custom package I'm importing. I don't think it will have any significant overhead for my code, but might be something to consider at massive scale. At scale, you wouldn't want to be logging below warning level, which you get by default from your root logger, so this may all be moot.
4. To simplify my code, I wrote a little function that accepts a list of package names and a list of handlers as inputs. It gets a logger for each name and attaches the handlers. For one or two additional loggers, it's probably just as easy to create them in your code as it is to call my def. But if you had a big list of packages to create loggers for, then it would really clean up your code.
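A minimal sketch of such a helper (attach_handlers and its signature are my own names, not from any library):
import logging

def attach_handlers(package_names, handlers, level=logging.DEBUG):
    # get a logger for each top-level package and wire up every handler
    for name in package_names:
        pkg_logger = logging.getLogger(name)
        pkg_logger.setLevel(level)
        for handler in handlers:
            pkg_logger.addHandler(handler)
With the script above, attach_handlers(['maps', 'charts'], [fh]) would replace the logger_maps and logger_charts blocks.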

How do you use logging in package which by default falls back to print in Python?

I want to change the print statements in my package to use logging. So I will write my modules like
import logging

logger = logging.getLogger(__name__)

def func():
    logger.info("Calling func")
which is the recommended way.
However, many users do not initialize logging and hence will not see the output.
Is there a recommended way so that users who do not initialize logging can see info output, and those who explicitly set up logging do not get their specific configuration tampered with by my package?
As a general rule of thumb, modules should never configure logging directly (nor make other unsolicited changes to the shared STDOUT/STDERR), as that's the realm of the module user. If the user wants the output, they should explicitly enable logging, and then, and only then, should your module be allowed to log.
Bonus points if you provide an interface for explicitly enabling logging within your module so that the user doesn't have to explicitly change levels / disable loggers of your inner components if they're interested only in logging from their own code.
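Such an opt-in interface could look like this (a hypothetical sketch; enable_logging is a made-up name, not a standard API):
# your_module.py
import logging

logger = logging.getLogger(__name__)

def enable_logging(level=logging.INFO):
    # opt-in helper: the user calls this to see this module's output
    handler = logging.StreamHandler()
    handler.setFormatter(logging.Formatter('%(name)s - %(levelname)s - %(message)s'))
    logger.addHandler(handler)
    logger.setLevel(level)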
Of course, to keep using logging when a STDOUT/STDERR handler is not yet initialized you can use logging.NullHandler so all you have to do in your module is:
import logging
logger = logging.getLogger(__name__)
logger.addHandler(logging.NullHandler()) # initialize your logger with a NullHandler
def func():
    logger.info("Calling func")

func()  # (((crickets)))
Now you can continue using your logger throughout your module and until the user initializes logging your module won't trespass on the output, e.g.:
import your_module
your_module.func() # (((crickets)))
import logging
logging.root.setLevel(logging.INFO)
logging.info("Initialized!") # INFO:root:Initialized!
your_module.func() # INFO:your_module:Calling func
Of course, the actual logging format, level and other settings should also be in the realm of the user so if they change the format of the root log handler, it should be inherited by your module logger as well.

How to disable imported module logging at root level

I'm importing a module that is logging information at the warning level. I think the person who wrote this code is logging at the root level, i.e. the code is just:
import logging
logging.warn("foo")
I've tried the code below, but it doesn't work, probably because the logging is sent to the root logger or something.
logging.getLogger(module).setLevel(logging.ERROR)
Is there a way that I could disable this module's specific logging?
There are several things confusing here:
logging.getLogger(module).setLevel(logging.ERROR)
The 'module' here should be a string, and is usually the fully-qualified name of the module, i.e. package.module.
Depending on the format, the name of the Logger is usually printed in the log. That way you can disable it easily.
For instance, if you have something like:
WARNING [foo.bar.baz] the message
The logger name should be foo.bar.baz.
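In that case you can silence it directly (assuming the name really is foo.bar.baz):
logging.getLogger('foo.bar.baz').setLevel(logging.ERROR)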
But, if you think it is the root logger, then you can try:
logger = logging.getLogger()
logger.setLevel(logging.ERROR)
Another thing which can be confusing is Python warnings. Take a look at this answer to disable them: https://stackoverflow.com/a/14463362

Should I use the logging module or a logging class?

I am writing a big program with many modules. In each module I wish to use logging. What is the best practice for logging in Python?
Should I import the standard logging module and use it in every one of my files:
#proxy_control.py
#!/usr/bin/env python3
import logging

class ProxyClass():
    def check_proxy():
        pass

logging.basicConfig(filename='proxy.log', level=logging.INFO)
logging.info('Proxy work fine')
Or maybe I should write a MyLog() class that inherits from the default logging and use it from all my other modules?
#proxy_control.py
#!/usr/bin/env python3
import MyLog

class ProxyClass():
    def check_proxy():
        pass

MyLog.send_log('Proxy work fine')
#my_log.py
#!/usr/bin/env python3
import logging

class MyLog():
    def send_log(value):
        pass
A typical convention is to define a logger at the top of every module that requires logging and then use that logger object throughout the module.
logger = logging.getLogger(__name__)
This way, loggers will follow your package names (i.e. package.subpackage.module). This is useful in the logging module because loggers propagate messages upwards based on the logger name (i.e. parent.child will pass messages up to parent). This means that you can do all your configuration on the top-level logger and it will get messages from all the sub-loggers in your package. Or, if someone else is using your library, it will be very easy for them to configure which logging messages they get from your library, because it will have a consistent namespace.
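For example (a minimal sketch with made-up package names):
import logging

# configure only the top-level package logger...
top = logging.getLogger('package')
top.setLevel(logging.DEBUG)
top.addHandler(logging.StreamHandler())

# ...and messages from any sub-logger propagate up to its handler
logging.getLogger('package.subpackage.module').debug('reaches the top-level handler')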
For a library, you typically don't want to show logging messages unless the user explicitly enables them. Python logging will automatically log to stderr unless you set up a handler, so you should add a NullHandler to your top-level logger. This will prevent the messages from automatically being printed to stderr.
logger = logging.getLogger(__name__)
logger.addHandler(logging.NullHandler())
NOTE: NullHandler was added in Python 2.7; for earlier versions, you'll have to implement it yourself.
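The back-port recipe from the logging documentation is tiny:
import logging

class NullHandler(logging.Handler):
    # a handler that deliberately does nothing with each record
    def emit(self, record):
        pass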
Use the logging module, and leave logging configuration to your application's entry point (modules should not configure logging by themselves).

Ideas on dynamic python logging

I have an app which runs several instances of a main app depending on external parameters. The main app imports a few libraries, which in turn import other libraries. They all have a global
LOGGER = logging.getLogger('module_name')
The logger is configured with a file handler, so all the logs get written to a single external file. Now I want to write logs to different files based on a certain name that is passed to the main app. I need something like
LOGGER = logging.getLogger(dynamic_criteria_name)
The result will be multiple log files, one per dynamic criteria name, and any time the logger is called from any of the modules it should write to the correct file based on the criteria it was called under.
But the problem is that the LOGGER is global. I could pass the logger or the dynamic_criteria_name to each function that writes to the log, but that somehow feels wrong. Maybe I'm just paranoid! Some of my modules contain only functions; I don't want to pass the logger around everywhere.
I thought about AOP for a while, but I don't think this is really cross-cutting: the logger is dynamically generated, so it only looks cross-cutting within one instance of the main app. I have thought about other ways to hack global state, but I think the dynamic nature makes it all impossible, at least in my head.
Below is the pseudocode (I haven't tried running it), but I think it explains better what I'm talking about. As you can see, module_1 imports module_2 and module_3, which both have a global LOGGER, and module_3 calls module_4. I'd like to find out if I can write to a separate log file from module_2 or module_3 without passing the name explicitly to each imported module's functions. I can add multiple handlers with different file names to a logger, but how can I refer to the correct logger from another module, given that they are all global at the moment?
# module_1.py
import logging
import time
import module_2
import module_3
LOGGER = logging.getLogger()
def start_main_loop(name):
    while True:
        module_2.say_boo()
        module_3.say_foo()
        LOGGER.debug('Sleeping...')
        time.sleep(10)

if __name__ == '__main__':
    logging.basicConfig(level=logging.DEBUG)
    for i in xrange(10):
        start_main_loop(i)
#----------------------------------------------------
# module_2.py
import logging
LOGGER = logging.getLogger()
def say_boo():
    msg = 'boo'
    LOGGER.debug(msg)
    LOGGER.debug(say_twice(msg))

def say_twice(msg):
    LOGGER.debug('Called say twice')
    return msg * 2
#----------------------------------------------------
# module_3.py
import logging
import module_4
LOGGER = logging.getLogger()
def say_foo():
    msg = 'foo'
    LOGGER.debug(msg)
    LOGGER.debug(say_twice(msg))
    module_4.say_bar()

def say_twice(msg):
    LOGGER.debug('Called say twice')
    return msg * 2
#----------------------------------------------------
# module_4.py
import logging
LOGGER = logging.getLogger()
def say_bar():
    msg = 'bar'
    LOGGER.debug(msg)
I'm willing to explore any ideas people might have. I hope I have explained myself clearly; if not, please let me know and I can rephrase the question. Thanks!
You don't need to pass loggers around: loggers are singletons. If code in multiple modules makes calls to getLogger('foo'), the same instance is returned for each call. So there is no need to pass logger instances around. In general the logger name indicates where a logging call originates from, so the __name__ value makes most sense. You could have code in a module foo log to a logger named bar - nothing prevents you doing this - it just hasn't been found especially useful in practice.
Sounds like AOP is overkill. Rather than passing logger names around, you might consider adding multiple handlers to the root logger, with specific filters on each handler to ensure that specific things go in specific files, according to your particular requirements.
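A sketch of that filter approach (the logger names alpha and beta are illustrative; the built-in logging.Filter passes records from a named logger and its children):
import logging

root = logging.getLogger()
root.setLevel(logging.DEBUG)

# one file per logger sub-tree, routed by a name filter on each handler
for name in ('alpha', 'beta'):
    handler = logging.FileHandler('%s.log' % name)
    handler.addFilter(logging.Filter(name))
    root.addHandler(handler)

logging.getLogger('alpha.worker').debug('goes to alpha.log')
logging.getLogger('beta.worker').debug('goes to beta.log')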
