Intercept all logging in Python - python

I am new to Python and even newer to Python logging.
My situation is this: I have a system where I need to trace data through a process, so I decided to use Python's logging system itself to carry the trace information.
Initially I created a new logging handler whose emit function sends the log record to another server, but only if the record carries, in its extra attributes, a variable I use for tracing.
So far so good: at every log level (from debug to critical) I can trace the data. My problem is that if someone sets the log level to critical and I am tracing data at the info level, I won't get the trace, since the record won't be processed.
I thought of two solutions. The first is to create a custom log level for the trace, which I don't think is the right choice. The second, which I believe is the right one, is to intercept all logs and check whether that extra variable is present. If it is, I will send the log to the server anyway, regardless of its level.
Since I am new to Python, I don't understand how the logging system works. Do I need another function in my handler? Do I need to create a custom LogRecord?
class RQHandler(logging.Handler):
    def __init__(
        self, formatter=JSONFormatter(), level=logging.NOTSET,
        connection_pool=None
    ):
        # run the regular Handler __init__
        logging.Handler.__init__(self, level)
        self.formatter = formatter
    def emit(self, record):
        # Send to the other
        ...

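In the logging module, Handler is a subclass of Filterer, which has the method filter(self, record).
A simple solution might be to override the filter method in RQHandler.
Look through the original Filterer code first to see what it does, but you should be able to override it to return True so that the LogRecord is always emitted. Keep in mind that a handler's filter only sees records that have already passed the logger's level check and the handler's own level check, so it cannot bring back a record that a logger set to CRITICAL has discarded.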
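As a rough illustration of overriding filter in a handler, here is a minimal sketch. The names TraceHandler, trace_id and the sent list are hypothetical stand-ins (the list replaces the real "send to another server" step), and the filter here accepts only records that carry the trace attribute rather than returning True unconditionally.

```python
import logging


class TraceHandler(logging.Handler):
    """Hypothetical handler that only accepts records carrying a trace attribute."""

    def __init__(self, trace_attr="trace_id", level=logging.NOTSET):
        super().__init__(level)
        self.trace_attr = trace_attr
        self.sent = []  # stands in for "send to the other server"

    def filter(self, record):
        # Called by Handler.handle() before emit(); a falsy result drops the record.
        # Overriding it replaces the default behaviour of consulting attached filters.
        return hasattr(record, self.trace_attr)

    def emit(self, record):
        self.sent.append(
            (record.levelname, getattr(record, self.trace_attr), record.getMessage())
        )


logger = logging.getLogger("trace_demo")
logger.setLevel(logging.DEBUG)  # the logger's level is checked before any handler runs
logger.propagate = False
handler = TraceHandler()
logger.addHandler(handler)

logger.info("no trace data")  # dropped: no trace_id attribute
logger.info("order received", extra={"trace_id": "abc123"})  # kept
```

Only the second call ends up in handler.sent, because values passed through extra become attributes of the LogRecord.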

Related

How to dynamically change logging file output in Python

How would you dynamically change the file where logs are written to in Python, using the standard logging package?
I have a single-process, multi-threaded application that processes tasks for specific logical bins. To help simplify debugging and searching the logs, I want each bin to have its own separate log file. Due to memory usage and scaling concerns, I don't want to split the process into multiple processes whose output I could otherwise easily redirect to a separate log. However, by default, Python's logging package only outputs to a single location, either stdout/stderr or some other single file.
My question's similar to this question except I'm not trying to change the logging level, just the logging output destination.
You will need to create a different logger for each thread and configure each logger to write to its own file.
You can call something like this function in each thread, with the appropriate bin_name:
def create_logger(bin_name, level=logging.INFO):
    handler = logging.FileHandler(f'{bin_name}.log')
    handler.setFormatter(logging.Formatter('%(asctime)s %(levelname)s %(message)s'))
    bin_logger = logging.getLogger(bin_name)
    bin_logger.setLevel(level)
    bin_logger.addHandler(handler)
    return bin_logger
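One thing to watch for with a function like this is that logging.getLogger returns the same logger object for the same name, so calling it twice for one bin would attach two file handlers and duplicate every line. A possible variation is sketched below; get_bin_logger, the "bins." name prefix and the temporary log directory are illustrative choices, not part of the answer above.

```python
import logging
import os
import tempfile


def get_bin_logger(bin_name, log_dir, level=logging.INFO):
    """Return a per-bin logger, attaching its file handler only once."""
    bin_logger = logging.getLogger(f"bins.{bin_name}")
    if not bin_logger.handlers:
        handler = logging.FileHandler(os.path.join(log_dir, f"{bin_name}.log"))
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s")
        )
        bin_logger.addHandler(handler)
    bin_logger.setLevel(level)
    bin_logger.propagate = False  # keep bin output out of the root logger's handlers
    return bin_logger


log_dir = tempfile.mkdtemp()
bin_a = get_bin_logger("bin_a", log_dir)
bin_a_again = get_bin_logger("bin_a", log_dir)  # same object, no second handler
bin_b = get_bin_logger("bin_b", log_dir)

bin_a.info("task for bin a")
bin_b.info("task for bin b")
```

Each message lands only in the file belonging to its own bin, and repeated calls from different threads reuse the same logger.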

Python switch a logger off by default

In Python, I want to optionally log from a module - but have this logging off by default, enabled with a function call. (The output from this file will be very spammy - so best off by default)
I want my code to look something like this.
log = logging.getLogger("module")
log.switch_off()
---
import module
module.log.switch_on()
I can't seem to find an option to disable a logger.
Options considered:
Using filters: I think this is a bit confusing for the client
Setting a level higher than one I use to log: (e.g. logging.CRITICAL). I don't like that we could inadvertently throw log lines into normal output if we use that level.
Use a flag and add ifs
Require the client to exclude our log events. See logging config
There are two pieces at play here. Python has logging.Logger objects and logging.Handler objects that work together to serve you logging information. Loggers handle the logic of collecting logging information, and deciding whether logs should be emitted to associated handlers. If the logging level of your log record is less severe than the level specified in the logger, it will not pass info to associated handlers.
Handlers have the same feature, and since handlers are the last line between log records and defined output, you would likely want to disable the interaction there. To accomplish this, and avoid having logs inadvertently logged elsewhere, you can add a new logging level to your application:
logging.addLevelName(logging.CRITICAL + 1, "DISABLELOGGING")
Note: This only maps the name to value for purposes of formatting, so you will need to add a member to the logging module as well:
logging.DISABLELOGGING = logging.CRITICAL + 1
Setting it to a value higher than CRITICAL ensures that no normal log event will pass and be emitted.
Then you just need to set your handler to the level you defined:
handler.setLevel(logging.DISABLELOGGING)
and now there should be no logs that pass the handler, and therefore no output shown.
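Put together, the approach might look like the sketch below. The switch_on and switch_off helpers, the logger name and the StringIO stream are assumptions made for the example; only addLevelName, setLevel and the CRITICAL + 1 value come from the answer.

```python
import io
import logging

# A level above CRITICAL, so no normal record can reach it.
DISABLE_LOGGING = logging.CRITICAL + 1
logging.addLevelName(DISABLE_LOGGING, "DISABLELOGGING")

log = logging.getLogger("spammy_module")
log.setLevel(logging.DEBUG)
log.propagate = False  # keep the spam out of the root logger's handlers

stream = io.StringIO()  # stands in for the real output destination
handler = logging.StreamHandler(stream)
handler.setLevel(DISABLE_LOGGING)  # off by default
log.addHandler(handler)


def switch_on(level=logging.DEBUG):
    """Hypothetical helper: let records at `level` and above through."""
    handler.setLevel(level)


def switch_off():
    """Hypothetical helper: block every record again."""
    handler.setLevel(DISABLE_LOGGING)


log.critical("dropped while off")
switch_on()
log.debug("visible while on")
switch_off()
log.critical("dropped again")
```

Keeping the threshold on the handler rather than on the logger means a client who attaches their own handler to the logger is not affected by the switch.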

How can I combine the python logging cookbook example of logging network events with a logging.config.listen thread?

Theoretically this should be simple. Taking the example from the logging cookbook here:
https://docs.python.org/3/howto/logging-cookbook.html#sending-and-receiving-logging-events-across-a-network
I want to add the ability to change the logging configuration on the fly. I simply added:
logging.config.dictConfig(...) # setup the root logger
config_thread = logging.config.listen()
config_thread.start()
tcpserver = LogRecordSocketReceiver()
and on startup, this works fine with the provided example of sending log events across the network to the socket receiver.
However, the problem occurs once I send in a new configuration. After that the log server won't produce any more logging messages. That happens even though each handleLogRecord() call gets a new instance of the logger through logging.getLogger().
Any ideas as to what I'm missing?
You need to ensure that in the configuration dictionary, you have disable_existing_loggers set to False. Otherwise, when a new configuration is applied, the existing loggers will be disabled and not produce any more output.
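The effect can be seen without any sockets by applying a dictionary configuration twice to a process that already holds a logger. This is only a sketch: the logger name demo.receiver and the make_config helper are made up for the illustration, and the same key would go into the configuration sent to the listener.

```python
import logging
import logging.config

# A logger created before any (re)configuration, like the one used by the receiver.
existing = logging.getLogger("demo.receiver")


def make_config(**overrides):
    """Hypothetical helper building a minimal dictConfig payload."""
    config = {
        "version": 1,
        "handlers": {"null": {"class": "logging.NullHandler"}},
        "root": {"level": "INFO", "handlers": ["null"]},
    }
    config.update(overrides)
    return config


# disable_existing_loggers defaults to True, so the existing logger is disabled.
logging.config.dictConfig(make_config())
disabled_by_default = existing.disabled

# With the key set to False, the existing logger stays usable.
logging.config.dictConfig(make_config(disable_existing_loggers=False))
disabled_when_false = existing.disabled
```

A disabled logger silently ignores every call, which matches the symptom of the server going quiet after a new configuration arrives.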

Context-dependent log level in Python

I'm prototyping a web application framework in Python (mostly for educational purposes) and I'm stuck on one feature I've wanted for a long time: a per-route log level.
The goal of this feature is to identify some specific entry points for which we're performing diagnostics. For example, I want to track what's going on when callers hit POST /sessions/login. Now, I want to get 100% of log entries for code hit by request processing for this URL. And this means everything, including whatever goes on in 3rd-party applications.
Example: fictional application has two routes: /sessions/login and /sessions/info. Both request handlers hit the same database code in package users, which uses logger myapp.users.db. Request processing for /sessions/login should emit log messages on logger myapp.users.db, but request processing for /sessions/info should not.
The problem is that this doesn't fit well with Python's logging library, which decomposes logging in a hierarchical fashion, which is nice for layering (e.g. controlling the log level by application layers).
What I really want is a context-dependent log level. The natural implementation that comes to mind is something that makes logger.getEffectiveLevel() return a thread-local log level (with debug middleware conditionally lowering the log level to debug if the request URL is subject to debugging). However, I'm looking at the logging flow in the Python documentation, and I don't understand how to implement this using any of the many different types of configuration hooks.
Question: how would you implement a context-dependent log level in Python?
Update: I found a partial solution.
context = threading.local()
class ContextualLogger(logging.Logger):
    def getEffectiveLevel(self):
        global context
        level = getattr(context, 'log_level', logging.NOTSET)
        if level == logging.NOTSET:
            level = super(ContextualLogger, self).getEffectiveLevel()
        return level
logging.setLoggerClass(ContextualLogger)
However, this doesn't work for the root logger. Any ideas?
Update: it's also possible to monkey patch the getEffectiveLevel() function.
context = threading.local()
# Monkey patch "getEffectiveLevel()" to consult the current setting in the
# `context.log_level` thread-local storage. If that value is present, use
# it to override the current value; else, compute the level using the usual
# infrastructure.
default_getEffectiveLevel = logging.Logger.getEffectiveLevel
def patched_getEffectiveLevel(self):
    level = getattr(context, 'log_level', logging.NOTSET)
    if level == logging.NOTSET:
        level = default_getEffectiveLevel(self)
    return level
logging.Logger.getEffectiveLevel = patched_getEffectiveLevel
Now, this works even for the root logger. I have to admit that I'm a little uncomfortable with monkey patching this function, but then again it falls back onto the usual infrastructure so it's actually not as dirty as it looks.
You're better off using a logging.Filter, attached to your loggers (or handlers), that uses the context either to drop the event (by returning False from the filter method) or to allow it to be logged (by returning True).
Though not exactly for your use case, I illustrated the use of filters with thread-local context in this post.
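A filter-based version could look roughly like the following. It is a sketch under several assumptions: ContextLevelFilter, ListHandler and handle_request are invented for the example, the logger itself is left at DEBUG so that every record reaches the filter, and the thread-local log_level attribute mirrors the one used in the question.

```python
import logging
import threading

context = threading.local()


class ContextLevelFilter(logging.Filter):
    """Drop records below a threshold taken from thread-local context."""

    def __init__(self, default_level=logging.WARNING):
        super().__init__()
        self.default_level = default_level

    def filter(self, record):
        threshold = getattr(context, "log_level", self.default_level)
        return record.levelno >= threshold


class ListHandler(logging.Handler):
    """Collects messages in memory so the effect is easy to inspect."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


db_logger = logging.getLogger("myapp.users.db")
db_logger.setLevel(logging.DEBUG)  # let everything reach the handler's filter
db_logger.propagate = False
handler = ListHandler()
handler.addFilter(ContextLevelFilter())
db_logger.addHandler(handler)


def handle_request(path, debug_routes=("/sessions/login",)):
    """Hypothetical middleware: lower the threshold only for debugged routes."""
    if path in debug_routes:
        context.log_level = logging.DEBUG
    try:
        db_logger.debug("query for %s", path)
    finally:
        if hasattr(context, "log_level"):
            del context.log_level


handle_request("/sessions/info")   # debug record dropped
handle_request("/sessions/login")  # debug record kept
```

Attaching the filter to a handler rather than to a single logger means it also applies to records that propagate up from third-party loggers to that handler.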

Advantages of logging vs. print() + logging best practices

I'm currently working on the 1.0.0 release of the pyftpdlib module.
This new release will introduce some backward-incompatible changes, in that certain APIs will no longer accept bytes but unicode.
While I'm at it, as part of this breakage, I was contemplating the possibility of getting rid of my logging functions, which currently use the print statement, and using the logging module instead.
As of right now pyftpdlib delegates logging to 3 functions:
def log(s):
    """Log messages intended for the end user."""
    print s
def logline(s):
    """Log commands and responses passing through the command channel."""
    print s
def logerror(s):
    """Log traceback outputs occurring in case of errors."""
    print >> sys.stderr, s
A user who wants to customize the logs (e.g. write them to a file) is supposed to just overwrite these 3 functions, as in:
>>> from pyftpdlib import ftpserver
>>>
>>> def log2file(s):
...     open('ftpd.log', 'a').write(s)
...
>>> ftpserver.log = ftpserver.logline = ftpserver.logerror = log2file
Now I'm wondering: what would be the benefits of getting rid of this approach and using the logging module instead?
From a module vendor's perspective, how exactly am I supposed to expose logging functionality in my module?
Am I supposed to do this:
import logging
logger = logging.getLogger("pyftpdlib")
...and state in my docs that "logger" is the object to use if the user wants to customize how logs behave?
Is it legitimate to deliberately set a predefined output format, as in:
FORMAT = '[%(asctime)s] %(message)s'
logging.basicConfig(format=FORMAT)
logger = logging.getLogger('pyftpdlib')
...?
Can you think of a third-party module I can take cues from where the logging functionality is exposed and consolidated as part of the public API?
Thanks in advance.
Libraries (an FTP server or client library) should never initialize the logging system. So it's OK to instantiate a logger object and to point at logging.basicConfig in the documentation (or to provide a function along the lines of basicConfig with fancier output, and let the user choose a logging configuration strategy: plain basicConfig or the library-provided configuration).
Frameworks (e.g. Django) or servers (an FTP server daemon) should initialize the logging system to a reasonable default and allow the logging configuration to be customized.
Typically libraries should just create a NullHandler handler, which is simply a do nothing handler. The end user or application developer who uses your library can then configure the logging system. See the section Configuring Logging for a Library in the logging documentation for more information. In particular, see the note which begins
It is strongly advised that you do not add any handlers other than NullHandler to your library's loggers.
In your case I would simply create a logging handler, as per the logging documentation,
import logging
logging.getLogger('pyftpdlib').addHandler(logging.NullHandler())
Edit: The logging implementation sketched out in the question seems perfectly reasonable. In your documentation, just mention logger and discuss, or point users to, the Logger.setLevel and Handler.setFormatter methods for customising the output from your library. Rather than using logging.basicConfig(format=FORMAT), you could consider using logging.config.fileConfig to manage the settings for your output and document the configuration file somewhere in your documentation, again pointing the user to the logging module documentation for the format expected in this file.
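To make the split of responsibilities concrete, here is a small sketch of both sides. The logger name pyftpdlib_demo, the serve_once function and the application's format string are invented for the example; the library part adds nothing but a NullHandler, and everything else is the application's choice.

```python
import io
import logging

# --- library side: create a logger and attach only a NullHandler ---
lib_logger = logging.getLogger("pyftpdlib_demo")
lib_logger.addHandler(logging.NullHandler())


def serve_once():
    """Hypothetical library function that logs through the library's logger."""
    lib_logger.info("220 ready")


# --- application side: decide where the library's logs go and how they look ---
stream = io.StringIO()  # stands in for a file or the console
app_handler = logging.StreamHandler(stream)
app_handler.setFormatter(logging.Formatter("[%(levelname)s] %(name)s: %(message)s"))

app_logger = logging.getLogger("pyftpdlib_demo")  # same object as lib_logger
app_logger.addHandler(app_handler)
app_logger.setLevel(logging.INFO)

serve_once()
```

Because loggers are looked up by name, the application never has to import a special object from the library; documenting the logger name is enough.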
Here is a resource I used to make a customizable logger. I didn't change much, I just added an if statement, and pass in whether or not I want to log to a file or just the console.
Check this Colorer out. It's really nice for colorizing the output so DEBUG looks different than WARN which looks different than INFO.
The Logging module bundles a heck of a lot of nice functionality, like SMTP logging, file rotation logging (so you can save a couple old log files, but not make 100s of them every time something goes wrong).
If you ever want to migrate to Python 3, using the logging module will remove the need to change your print statements.
Logging is awesome depending on what you're doing, I've only lightly used it before to see where I am in a program (if you're running this function, color this way), but it has significantly more power than a regular print statement.
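As one example of that bundled functionality, file rotation takes only a few lines with logging.handlers.RotatingFileHandler. The sketch below uses deliberately tiny, arbitrary limits (100 bytes per file, two backups) and a temporary directory so that the rotation is visible after a handful of messages.

```python
import logging
import logging.handlers
import os
import tempfile

log_dir = tempfile.mkdtemp()
handler = logging.handlers.RotatingFileHandler(
    os.path.join(log_dir, "app.log"),
    maxBytes=100,   # roll over once the file would reach about 100 bytes
    backupCount=2,  # keep app.log.1 and app.log.2, discard anything older
)

rot_logger = logging.getLogger("rotation_demo")
rot_logger.setLevel(logging.INFO)
rot_logger.propagate = False
rot_logger.addHandler(handler)

for i in range(50):
    rot_logger.info("message %02d", i)
handler.close()

# However many messages are written, only the current file and two backups remain.
files = sorted(os.listdir(log_dir))
```

Getting the same behaviour out of print-based logging functions would mean reimplementing the size check and the renaming by hand.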
You can look at Django (just create a sample project) and see how it initializes the logging subsystem.
There is also a contextual logger helper that I wrote some time ago. This logger automatically takes the name of the module/class/function it was initialized from. This is very useful for debug messages, where you can see right away which module emits the messages and how the call flow goes.
