I have, hopefully, a simple question on python/django logging.
I am writing a Django app, which calls some modules that are shared with other processes, jobs, scripts and possibly even some other apps in due course - for example common calculations.
I have configured logging for my Django app in settings.py and then in my main app modules (views.py, etc) I get the logger via:
logger = logging.getLogger('exampleApp')
Everything works as expected.
However, when my app calls modules in other packages that are shared I obviously do not want to have the same logging.getLogger('exampleApp') call because then other processes that use this module will use an incorrect logger config.
Furthermore, using
logging.getLogger(__name__)
means that when I call this module from my Django app, it doesn't log to the correct file.
So my question is: for shared modules, is there a way for the logger to use the instance from the 'parent caller', in other words the Django app, a standalone server process, a script, and so on? Or is it as simple as having to pass the logger instance into the classes defined in the module (as a parameter in the constructor)?
Hope that makes sense.
Thanks.
The way I have got this to work, without passing around logger objects, is to change
logger = logging.getLogger('exampleApp')
in the top level module to
logger = logging.getLogger()
so that it actually gets the root logger. The loggers in the shared modules then follow the root logger's configuration, because their records propagate up to the root logger and are handled by whatever handlers and formatters are attached to it.
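A minimal, self-contained sketch of that setup, with the module boundaries collapsed into one file for brevity (the names common.calculations and job.log are made up for illustration):
import logging

# --- shared module (would live in e.g. common/calculations.py) -------
calc_logger = logging.getLogger("common.calculations")  # no handlers of its own

def add(a, b):
    calc_logger.debug("adding %s and %s", a, b)
    return a + b

# --- parent caller (Django settings, a standalone script, etc.) ------
logging.basicConfig(
    filename="job.log",
    level=logging.DEBUG,
    format="%(asctime)s %(name)s %(levelname)s %(message)s",
)

add(1, 2)  # the DEBUG record propagates up to the root logger and lands in job.log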
Related
I have written a simple python package which has a set of functions that perform simple operations (data manipulation). I am trying to enhance the package and add more functionality for logging, which leads me to this question.
Should I expect the user of the package to pass in a file descriptor or a file handler of the python logging module into the methods of the package, or should the package itself have its own logging module which the methods within the package employ.
I can see benefits (the user controls logging and can maintain a flow of function calls based on the same handler) and drawbacks (the user's logger may not be good enough) in both; however, what are the best practices in this case?
In your module, create a logger object:
import logging
LOGGER = logging.getLogger(__name__)
And then call the appropriate functions on that object:
LOGGER.debug('you probably dont care about this')
LOGGER.info('some info message')
LOGGER.error('whoops, something went wrong')
If the user has configured the logging subsystem correctly, then the messages will automatically go where the user wants them to go (a file, stderr, syslog, etc.)
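For illustration, configuring the logging subsystem on the user's side might look something like this (the package name mypackage is hypothetical):
import logging

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(name)s %(levelname)s: %(message)s",
)
# Quieten one noisy part of the package without touching the rest.
logging.getLogger("mypackage.internals").setLevel(logging.WARNING)

# From here on, any LOGGER.info(...) / LOGGER.error(...) calls made inside
# mypackage show up on the console in the format chosen above.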
The python logging module has a common pattern (ex1, ex2) where you get a separate logger object in each python module.
I'm not a fan of blindly following patterns and so I would like to understand a little bit more.
Why get a new logger object in each new module?
Why not have everyone just use the same root logger and configure the formatter with %(module)s?
Are there examples where this pattern is NECESSARY/NEEDED (i.e. because of some sort of performance reason[1])?
[1]
In a multi-threaded python program, is there some hidden synchronization issue that is fixed by using multiple logger objects?
Each logger can be configured separately. Generally, a module logger is not configured at all in the module itself. You create a distinct logger and use it to log messages of varying levels of detail. Whoever uses the logger decides what level of messages to see, where to send those messages, and even how to display them. They may want everything (DEBUG and up) from one module logged to a file, while another module they may only care if a serious error occurs (in which case they want it e-mailed directly to them). If every module used the same (root) logger, you wouldn't have that kind of flexibility.
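A rough sketch of that flexibility (the module names app.db and app.payments are made up):
import logging

# Everything from app.db (DEBUG and up) goes to a file...
db_logger = logging.getLogger("app.db")
db_logger.setLevel(logging.DEBUG)
db_logger.addHandler(logging.FileHandler("db.log"))

# ...while app.payments only reports serious problems to stderr.
pay_logger = logging.getLogger("app.payments")
pay_logger.setLevel(logging.ERROR)
pay_logger.addHandler(logging.StreamHandler())

db_logger.debug("connection pool created")      # written to db.log
pay_logger.error("charge failed for order 42")  # printed to stderr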
The logger name defines where (logically) in your application events occur. Hence, the recommended pattern
logger = logging.getLogger(__name__)
uses logger names which track the Python package hierarchy. This in turn allows whoever is configuring logging to turn verbosity up or down for specific loggers. If everything just used the root logger, one couldn't get fine-grained control of verbosity, which is important when systems reach a certain size / complexity.
The logger names don't need to exactly track the package names - you could have multiple loggers in certain packages, for example. The main deciding factor is how much flexibility is needed (if you're writing an application) and perhaps also how much flexibility your users need (if you're writing a library).
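As a sketch of that fine-grained control, logging.config.dictConfig lets you raise verbosity for just one subtree of the logger hierarchy (the dotted names here are hypothetical):
import logging.config

logging.config.dictConfig({
    "version": 1,
    "disable_existing_loggers": False,
    "handlers": {
        "console": {"class": "logging.StreamHandler"},
    },
    "root": {"level": "WARNING", "handlers": ["console"]},
    "loggers": {
        # Only this subtree becomes verbose; everything else stays at WARNING.
        "myapp.importer": {"level": "DEBUG"},
    },
})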
I am inspecting the logging.Logger.manager.loggerDict by doing:
import logging
logging.Logger.manager.loggerDict
and the dict is as follows:
{
    'nose.case': <celery.utils.log.ProcessAwareLogger object at 0x112c8dcd0>,
    'apps.friends': <logging.PlaceHolder object at 0x1147720d0>,
    'oauthlib.oauth2.rfc6749.grant_types.client_credentials': <celery.utils.log.ProcessAwareLogger object at 0x115c48710>,
    'apps.adapter.views': <celery.utils.log.ProcessAwareLogger object at 0x116a847d0>,
    'apps.accounts.views': <celery.utils.log.ProcessAwareLogger object at 0x116976990>,
}
There are more but I truncated it
My questions are :
How come celery is involved in the logging of various other non-celery apps? Is it because logging is done in an async way, and the logging framework somehow detects the presence of celery and uses it?
For my own modules that log using logger = logging.getLogger(__name__), I see that one is a logging.PlaceHolder object and the other two are celery.utils.log.ProcessAwareLogger objects, although those two loggers are created in views and not in celery processes. How did it become this way?
Thanks
Celery itself replaces the (global) logger class, using the logging.setLoggerClass method, with a ProcessAwareLogger class that does a couple of things: avoid trying to log while in a signal handler, and add a process name to logs. This happens as soon as Celery's logging system is set up. You're seeing this class even on your own loggers because of the global nature of setLoggerClass.
As for why, exactly, Celery is designed like that, I think you'd have to ask a developer of Celery, but effectively it allows Celery to ensure that signal handler safety and process name are taken care of even if you use your own loggers in your app.
The python logging docs note:
If you are implementing asynchronous signal handlers using the signal module, you may not be able to use logging from within such handlers. This is because lock implementations in the threading module are not always re-entrant, and so cannot be invoked from such signal handlers.
Celery uses the signal module, so this may be a reason for wanting to globally enforce its logger class.
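A small sketch of the mechanism (not Celery's actual code): once setLoggerClass has been called, every logger created afterwards is an instance of the custom class.
import logging

class PrefixedLogger(logging.Logger):
    # Purely illustrative customisation.
    def handle(self, record):
        record.msg = "[custom] %s" % record.msg
        super().handle(record)

logging.setLoggerClass(PrefixedLogger)

logger = logging.getLogger("some.module")  # created after the swap
print(type(logger))                        # <class '__main__.PrefixedLogger'>
Loggers created before the call keep whatever class was in effect at the time they were created, which is why a mix of classes can show up in loggerDict.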
I recently found some tutorials for the logging module in python, and I am not sure whether it is enough to use e.g. logging.info() or if I will need the Logger object.
import logging
logging.basicConfig(filename="test.log", level=logging.DEBUG)
logging.info("test logging")
logger = logging.getLogger(__name__)
logger.info("test logger")
Both messages will go to test.log, but instead of root, the second one will say __main__.
For example, the how-to first explains logging without the Logger object in the basic tutorial. The advanced tutorial then introduces the Logger object and says:
Loggers expose the interface that application code directly uses.
Maybe someone could give me a tiny example of what that means, or what the advantage of the objects really are?
I want to add a custom logger to a python application I am working on. I think that the logging module intends to support this kind of customization but I don't really see how it works with the typical way of using the logging module. Normally, you would create a logger object in a module like,
import logging
logger = logging.getLogger(__name__)
However, somewhere in the application, probably the entry-point, I will have to tell the logging module to use my logger class,
logging.setLoggerClass(MyLogger)
However, this is often going to be called after modules have been imported and the logger objects are already allocated. I can think of a couple of ways to work around this problem (using a manager class to register logger allocations or calling getLogger() for each log record -- yuk), but this does not feel right and I wanted to know what the right way of doing this is.
Any logging initialisation / settings / customisation should be done before the application code runs. You can do this by putting it in the __init__.py file of your main application directory (this means it'll run before any of your modules are imported / read).
You can also put it in a settings.py module and import that module as the first thing in your application. As long as the logging setup code runs before any getLogger calls are made, you're good.
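A sketch of that ordering, with a hypothetical package layout (MyLogger and the file names are assumptions for illustration):
# myapp/__init__.py - runs before any myapp.* module is imported
import logging

class MyLogger(logging.Logger):
    """Hypothetical custom logger class."""

logging.setLoggerClass(MyLogger)

# myapp/views.py - imported later, so this already returns a MyLogger instance
import logging
logger = logging.getLogger(__name__)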