Using `str.format` to template log messages - python

I'm trying to use str.format style templating in my logging. Can't seem to get it working properly.
>>> import logging
>>> logging.basicConfig(filename='/tmp/example', format='{asctime} - {levelname} - {message}', style='{', level=logging.INFO)
>>> logger = logging.getLogger(__name__)
>>> logger.warning('blah')
>>> logger.warning('{foo:03d}', {'foo': 42})
Actual output:
2017-02-23 16:11:45,695 - WARNING - blah
2017-02-23 16:12:11,432 - WARNING - {foo:03d}
Expected output:
2017-02-23 16:11:45,695 - WARNING - blah
2017-02-23 16:12:11,432 - WARNING - 042
What am I missing in this setup?
I'm not interested to see workarounds that format the string before it's logged, or Python 2 solutions which use old %-style templating.

Apparently the style argument only applies to information about messages (such as a timestamp, severity, etc.) and not to actual messages.
From the docstring of logger.warning:
warning(msg, *args, **kwargs) method of logging.Logger instance
Log 'msg % args' with severity 'WARNING'.
It seems that the msg is always formatted using old-style formatting, so the style argument of the logger is not even considered.
The logging HOWTO contains a bit more information:
... you cannot directly make logging calls using str.format() or
string.Template syntax, because internally the logging package uses
%-formatting to merge the format string and the variable arguments.
There would no changing this while preserving backward compatibility,
since all logging calls which are out there in existing code will be
using %-format strings.

"I'm not interested to see workarounds that format the string before it's logged" Why not? The old style formats it before it's logged also... You can put the line to do the formatting inside the logger.warning call and it changes nothing functionally. If {foo:42} is just one member of a much larger dictionary you can do something like this:
for key,val in warningsDictionary.iteritems():
logger.warning('{'+key+':'+format(val,'03')+'}')
Whether or not this is sufficient is dependent on what you actually are trying to do.

Related

Declare python module in yaml

I have a yaml file which has some fields with values that are understandable in python, but they get parsed as string values, not that python type I meant. This is my sample:
verbose:
level: logging.DEBUG
and obviously when I load it, the value is string type
config = yaml.load(args.config.read(), Loader=yaml.SafeLoader)
I have no idea how to get exactly logging.DEBUG object, not its string.
Note that I don't look for configuring logging to get logger thing. This logging is just a sample of python module.
There's no out of the box way for that. The simplest and safest way seems to be processing the values manually, e.g:
import logging
class KnownModules:
logging = logging
...
def parse_value(s):
v = KnownModules
for p in s.split('.'):
v = getattr(v, p) # remember to handle AttributeError
return v
However, if you're ok with slightly changing your YAML structure, PyYAML supports some custom YAML tags. For example:
verbose:
level: !!python/name:logging.DEBUG
will make config['verbose']['level'] equal to logging.DEBUG (i.e. 10).
Considering that you're (correctly) using SafeLoader, you may need to combine those methods by defining your own tag.
The YAML loader has no knowledge of what logging.DEBUG might mean except a string "logging.DEBUG" (unless it's tagged with a YAML tag).
For string values that need to be interpreted as e.g. references to module attributes, you will need to parse them after-the-fact, e.g.
def parse_logging_level(level_string: str):
module, _, value = level_string.partition(".")
assert module == "logging"
return logging._nameToLevel[value]
# ...
yaml_data["verbose"]["level"] = parse_logging_level(yaml_data["verbose"]["level"])
Edit: Please see AKX answer. I was not aware of logging._nameToLevel which does not require defining your own enum and is definitely better than using evel. But, I decided to not delete this answer as I think the current preferred design (as of python 3.4) which uses enums is worth mentioning (it would probably be used in the logging module if it was available back then).
If you are absolutely sure that the values provided in the config are legitimate ones, you can use eval like this:
import logging
levelStr = 'logging.DEBUG'
level = eval(levelStr)
But as said in the comments, if you are not sure about the values present in the config file, using eval could be disasterous (see the example provided by AKX in the comments).
A better design is to define an enum for this purpose. Unfortunately the logging module does not provide the levels as enum (they are just constants defined in the module), thus you should define your own.
from enum import Enum
class LogLevel(Enum):
CRITICAL = 50
FATAL = 50
ERROR = 40
WARNING = 30
WARN = 30
INFO = 20
DEBUG = 10
NOTSET = 0
and then you can use it like this:
levelStr = 'DEBUG'
levelInt = LogLevel[levelStr].value # Comparable with logging.DEBUG which is also an integer
But to use this you have to change your yml file a bit and replace logging.DEBUG with DEBUG.

Formatting the message in Python logging

I would like to understand if it's possible and how is it possible, to modify the message part of a log message, using the Python logging module.
So basically, you can format a complete log as:
format = '{"timestamp": "%(asctime)s", "logger_level": "%(levelname)s", "log_message": %(message)s}'
However, I would like to make sure the message part is always in json format. Is there any way I can modify the format of only the message part, maybe with a custom logging.Formatter?
Thank you.
The format specification %(message)s tells Python you want to format a string. Try it with %(message)r and it should do the job:
>>> logging.error('{"log_message": %r}', {"a": 55})
ERROR:root:{"log_message": {'a': 55}}
There's an example in the Logging Cookbook which shows one way of doing this.Basically:
import json
import logging
class StructuredMessage:
def __init__(self, message, /, **kwargs):
self.message = message
self.kwargs = kwargs
def __str__(self):
return '%s >>> %s' % (self.message, json.dumps(self.kwargs))
_ = StructuredMessage # optional, to improve readability
logging.basicConfig(level=logging.INFO, format='%(message)s')
logging.info(_('message 1', foo='bar', bar='baz', num=123, fnum=123.456))
Of course, you can adapt this basic idea to do something closer to what you want/need.
Update: The formatting only happens if the message is actually output. Also, it won't apply to logging from third-party libraries. You would need to subclass Logger before importing any other modules which import logging to achieve that, but it's a documented approach.

Following Python Lazy Logging Formatting, but pylint is still showing (logging-not-lazy) messages

I'm logging some info messages and wanted to make sure I was following the PEP standards. Pylint says that this line:
LOG.info('Directories have been made in /home/bushbak2/projects/'+
'system_file_audit/%s/', manu)
Is not following the lazy logging format of python.
Could it be because the line carries over?
This is the message I get from pylint:
auto_audit.py:157:8: W1201: Use lazy % formatting in logging functions (logging-not-lazy)
Why does pylint still raise this message? Am I not appropriately following the PEP standard?
Cheers,
Brendan
The problem with the + between the two strings is that adds an add operation that must be evaluated in runtime before calling LOG.debug.
This can be a performance problem if you have to evaluate this line too often even if the debug logging flag is not set for your program.
This is the abstract syntax tree for logging.debug("this is " + "a test %s", 123):
As you can see, the first argument of the function call always has an add operation.
On the other hand, if we use logging.debug("this is " "a test %s", 123) (with whitespace instead of a + sign). The python compiler produces this other AST:
What happens with f-strings? A lot of people (including me) is in love with python f-strings. If we pass an f-string, the result is similar to the one with the plus sign.
The AST for logging.debug("this is a test {123}") shows python must perform a string operation before the function call:
Your problem comes from the + in your expression. Pylint sees it as a string concatenation and thus, recommends you to do this concatenation lazily.
You can fix it by removing the +:
LOG.info('Directories have been made in /home/bushbak2/projects/'
'system_file_audit/%s/', manu)

'Wildcard' for checking captured log outputs using Python's testfixtures module

I'm writing some unit tests for a server program which catches most exceptions, but logs them, and would like to make assertions on the logged output. I've found the testfixtures package useful to this end; for example:
import logging
import testfixtures
with testfixtures.LogCapture() as l:
logging.info('Here is some info.')
l.check(('root', 'INFO', 'Here is some info.'))
Following the documentation, the check method will raise an error if either the logger name, level, or message is not as expected.
I would like to perform a more 'flexible' kind of test in which I make assertions on the message using a wildcard for the other elements of the tuple. This less stringent assertion would look something like
l.check((*, *, 'Here is some info.'))
but this is not valid syntax. Is there any way to specify a 'wildcard' in the check method of the testfixtures.logcapture.LogCapture class?
The way to check messages only (which, as pointed out to me by the author, is actually described in the documentation) is to use the records attribute of the LogCapture class, which is a list of logging.LogRecord objects. So the appropriate assertion is:
assert l.records[-1].getMessage() == 'Here is some info.'

PyLint message: logging-format-interpolation

For the following code:
logger.debug('message: {}'.format('test'))
pylint produces the following warning:
logging-format-interpolation (W1202):
Use % formatting in logging functions and pass the % parameters as
arguments Used when a logging statement has a call form of
“logging.(format_string.format(format_args...))”. Such
calls should use % formatting instead, but leave interpolation to the
logging function by passing the parameters as arguments.
I know I can turn off this warning, but I'd like to understand it. I assumed using format() is the preferred way to print out statements in Python 3. Why is this not true for logger statements?
It is not true for logger statement because it relies on former "%" format like string to provide lazy interpolation of this string using extra arguments given to the logger call. For instance instead of doing:
logger.error('oops caused by %s' % exc)
you should do
logger.error('oops caused by %s', exc)
so the string will only be interpolated if the message is actually emitted.
You can't benefit of this functionality when using .format().
Per the Optimization section of the logging docs:
Formatting of message arguments is deferred until it cannot be avoided. However, computing the arguments passed to the logging method can also be expensive, and you may want to avoid doing it if the logger will just throw away your event.
Maybe this time differences can help you.
Following description is not the answer for your question, but it can help people.
If you want to use fstrings (Literal String Interpolation) for logging, then you can disable it from .pylintrc file with disable=logging-fstring-interpolation, see: related issue and comment.
Also you can disable logging-format-interpolation.
For pylint 2.4:
There are 3 options for logging style in the .pylintrc file: old, new, fstr
fstr option added in 2.4 and removed in 2.5
Description from .pylintrc file (v2.4):
[LOGGING]
# Format style used to check logging format string. `old` means using %
# formatting, `new` is for `{}` formatting,and `fstr` is for f-strings.
logging-format-style=old
for old (logging-format-style=old):
foo = "bar"
self.logger.info("foo: %s", foo)
for new (logging-format-style=new):
foo = "bar"
self.logger.info("foo: {}", foo)
# OR
self.logger.info("foo: {foo}", foo=foo)
Note: you can not use .format() even if you select new option.
pylint still gives the same warning for this code:
self.logger.info("foo: {}".format(foo)) # W1202
# OR
self.logger.info("foo: {foo}".format(foo=foo)) # W1202
for fstr (logging-format-style=fstr):
foo = "bar"
self.logger.info(f"foo: {foo}")
Personally, I prefer fstr option because of PEP-0498.
In my experience a more compelling reason than optimization (for most use cases) for the lazy interpolation is that it plays nicely with log aggregators like Sentry.
Consider a 'user logged in' log message. If you interpolate the user into the format string, you have as many distinct log messages as there are users. If you use lazy interpolation like this, the log aggregator can more reasonably interpret this as the same log message with a bunch of different instances.
Here is an example of why it's better to use %s instead of f-strings in logging.
>>> import logging
>>> logging.basicConfig(level=logging.INFO)
>>> logger = logging.getLogger('MyLogger')
>>>
>>> class MyClass:
... def __init__(self, name: str) -> None:
... self._name = name
... def __str__(self) -> str:
... print('GENERATING STRING')
... return self._name
...
>>> c = MyClass('foo')
>>> logger.debug('Created: %s', c)
>>> logger.debug(f'Created: {c}')
GENERATING STRING
Inspired by Python 3.7 logging: f-strings vs %.
Might be several years after but having to deal with this the other day, I made simple; just formatted the string before logger.
message = 'message: {}'.format('test')
logger.debug(message)
That way there was no need to change any of the settings from log, if later on desire to change to a normal print there is no need to change the formatting or code.
"logging-format-interpolation (W1202)" is another one wrong recommendation from pylint (like many from pep8).
F-string are described as slow vs %, but have you checked ?
With 500_000 rotation of logging with f-string vs % -> f-string:23.01 sec. , %:25.43 sec.
So logging with f-string is faster than %.
When you look at the logging source code : log.error() -> self.logger._log() -> self.makeRecord() -> self._logRecordFactory() -> class LogRecord() -> home made equivalent to format()
code :
import logging
import random
import time
loops = 500_000
r_fstr = 0.0
r_format = 0.0
def test_fstr():
global loops, r_fstr
for i in range(0, loops):
r1 = time.time()
logging.error(f'test {random.randint(0, 1000)}')
r2 = time.time()
r_fstr += r2 - r1
def test_format():
global loops, r_format
for i in range(0 ,loops):
r1 = time.time()
logging.error('test %d', random.randint(0, 1000))
r2 = time.time()
r_format += r2 - r1
test_fstr()
test_format()
print(f'Logging f-string:{round(r_fstr,2)} sec. , %:{round(r_format,2)} sec.')

Categories