Grouping '.format()'-style logging messages in Sentry - python

I use SentryHandler from raven.handlers.logging to send higher-level log records to Sentry. My logging messages are dynamically populated with custom content via .format(), so the message text itself isn't always the same. For example:
import logging
from raven.handlers.logging import SentryHandler
from raven.conf import setup_logging

# Create a "basic" logger
logger = logging.getLogger("root")

# Create a Sentry logger handler
sh = SentryHandler("https://******#sentry.io/******")
sh.setLevel(logging.WARNING)
setup_logging(sh)

# Send the desired message to Sentry via logger
if SomeInterestingWarning():
    logger.warning("{} missing files in {} directories!".format(num_files, num_dirs))
All good, except that this causes every unique message to be treated as a distinct warning in Sentry, which of course isn't what I want.
There is a nice Q&A covering this very problem on GitHub, but the solution provided there only applies to strings formatted with the old-fashioned %s style.
Does anybody know how to implement proper Sentry message grouping (aggregating) without having to redesign string formatting from format() back to %s placeholders?

Now you can; see:
https://docs.python.org/3/howto/logging-cookbook.html#use-of-alternative-formatting-styles
The gist is to use a formatter with style="{".
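For illustration, here is a minimal sketch adapted from that cookbook recipe (a message wrapper plus a LoggerAdapter), which keeps the {}-style template constant and passes the values separately. Whether Sentry then groups on the raw template depends on how the SentryHandler inspects record.msg, so treat that part as an assumption to verify:

import logging

class BraceMessage:
    # Holds a {}-style template and its arguments; formatting is deferred to str().
    def __init__(self, fmt, args):
        self.fmt = fmt
        self.args = args

    def __str__(self):
        return self.fmt.format(*self.args)

class StyleAdapter(logging.LoggerAdapter):
    # Lets you write logger.warning("{} missing files in {} directories!", n, m)
    def log(self, level, msg, *args, **kwargs):
        if self.isEnabledFor(level):
            msg, kwargs = self.process(msg, kwargs)
            self.logger._log(level, BraceMessage(msg, args), (), **kwargs)

logger = StyleAdapter(logging.getLogger(__name__), {})
logger.warning("{} missing files in {} directories!", 3, 2)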

Related

How to get the current python logging config?

I am using the python logging library to configure my loggers with an input dict, like this:
logging.config.dictConfig(config)
I have a special function that should use a new logging config, and then switch back to the original config at the end of the function. The original config can vary, so I do not want to hardcode it. Some pseudocode to describe what I want:
def switch_logger_for_this_code():
    old_logging_config = logger.get_current_logging_config()  # This is what I want to accomplish
    logging.config.dictConfig(new_logging_config)
    logging.info('This log goes to the new config!')
    logging.config.dictConfig(old_logging_config)
    logging.info('This log goes to the old config!')
    return
Is this possible to do with the logging library?
I don't believe there is any built-in way of retrieving the config of an initialized logger and storing it in a variable. I'm not sure if this would work for your project, but have you considered just creating a temporary logger object for use within the function's scope?
import logging
from sys import stdout

def switch_logger_for_this_code(msg) -> None:
    # Create logging.Formatter for output customization
    formatter = logging.Formatter(
        fmt="[%(asctime)s.%(msecs)d] %(message)s",
        datefmt="%Y/%m/%d %H:%M:%S"
    )
    handler = logging.StreamHandler(stdout)  # Create output handler
    handler.setLevel(logging.ERROR)          # Specify logging level
    handler.setFormatter(formatter)          # Apply desired format
    temp_log = logging.getLogger("temp")     # Get new or existing Logger
    temp_log.addHandler(handler)             # Add output handler
    temp_log.error(msg)                      # Log the message

switch_logger_for_this_code("hello world")   # Log 'hello world'
# Which would output the following:
# [2020/10/20 15:09:18.548] hello world
The main idea behind this approach is that the original logger object won't be modified, but you can still use the temporary logger to log what you had originally intended in your desired format.
I just looked through the source code for the logging module, and I don't see any way to get at the internal configuration, which starts as the passed-in configuration but can then be modified by other calls to the logging system.
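Depending on your needs, a rough workaround is to snapshot the root logger's runtime state (its handlers and level) before applying the new config and restore that snapshot afterwards. This is only a sketch: it assumes new_config is a valid dictConfig dictionary, and it restores the root logger only, not every logger the config may have touched:

import logging
import logging.config
from contextlib import contextmanager

@contextmanager
def temporary_logging_config(new_config):
    root = logging.getLogger()
    old_handlers = root.handlers[:]   # snapshot the root logger's handlers
    old_level = root.level            # and its level
    try:
        logging.config.dictConfig(new_config)
        yield
    finally:
        root.handlers[:] = old_handlers   # put the snapshot back
        root.setLevel(old_level)

# Usage:
# with temporary_logging_config(new_logging_config):
#     logging.info('This log goes to the new config!')
# logging.info('This log goes back to the old root handlers!')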

Logging each API call into a separate file

I have a Django application in which each API call is associated with a transaction_id. I want to create a separate log file for each transaction_id.
In simple words, I want to have multiple files which I will be using for logging.
How can I do this using Django's built-in logging system?
I can have multiple handlers on a single logger, but in my case FileHandlers have to be added at run time, with the file name being the transaction_id. This can be done. The problem is that if I have 4 transactions running at a time, 4 handlers are added to the same logger, and according to the documentation logs are sent to every handler, so one transaction's log file ends up containing the other three transactions' logs as well.
Following is what I have come up with:
class TransactionLogger:
    def __init__(self, transaction_id):
        self.logger = logging.getLogger('transaction_logger')
        logger = self.logger
        fileHandler = logging.FileHandler(transaction_id, mode='a')
        formatter = logging.Formatter('%(levelname)s %(asctime)s %(filename)s:%(lineno)s - %(funcName)s() ] %(message)s')
        fileHandler.setFormatter(formatter)
        self.logger.addHandler(fileHandler)
        self.logger.propagate = False
At the beginning of each transaction I instantiate the logger as:
logger = TransactionLogger(transaction_id).logger
and log as follows:
logger.debug("Hello World")
How can I maintain n log files, generated dynamically, and log to each file based on transaction_id without interfering with the other files?
Any help is appreciated.
I would not say that it's a good design to store logs like this. A better approach would be to write a custom format that puts the transaction ID in each log line, which you can then use to filter all your logs.
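For illustration, a rough sketch of that recommendation (all names are illustrative): a LoggerAdapter stamps every record with the transaction ID, so one shared log file can be filtered per transaction:

import logging

logging.basicConfig(
    level=logging.DEBUG,
    format="%(asctime)s %(levelname)s [txn=%(transaction_id)s] %(message)s",
)

def txn_logger(transaction_id):
    # The adapter injects transaction_id into every record it emits.
    return logging.LoggerAdapter(logging.getLogger("transactions"),
                                 {"transaction_id": transaction_id})

txn_logger("txn-42").debug("Hello World")
# ... DEBUG [txn=txn-42] Hello World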
Still, here are two ways you can achieve this:
1) By using logging._acquireLock() and logging._releaseLock(), or you can use locking via LOCK as explained here.
2) Create a new logger every time (by inheriting logging.Manager and adding the new logger to self.loggerDict) and delete it at the end of the execution (so that the system doesn't run out of memory).
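A minimal sketch of the second idea without touching logging.Manager internals, assuming one uniquely named logger per transaction is acceptable (logger and file names are illustrative): each transaction gets its own logger and FileHandler, propagation is disabled so other transactions' handlers never see its records, and the handler is closed when the transaction ends:

import logging

def get_transaction_logger(transaction_id):
    # One uniquely named logger per transaction, each with its own file.
    logger = logging.getLogger('transaction.{}'.format(transaction_id))
    if not logger.handlers:  # avoid stacking handlers if called twice
        handler = logging.FileHandler('{}.log'.format(transaction_id), mode='a')
        handler.setFormatter(logging.Formatter(
            '%(levelname)s %(asctime)s %(filename)s:%(lineno)s - %(funcName)s() ] %(message)s'))
        logger.addHandler(handler)
        logger.setLevel(logging.DEBUG)
        logger.propagate = False  # keep records out of other handlers
    return logger

def close_transaction_logger(logger):
    # Detach and close the file handler when the transaction finishes.
    for handler in logger.handlers[:]:
        logger.removeHandler(handler)
        handler.close()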

Getting logs twice in AWS lambda function

I'm attempting to create a centralized module to set up my log formatter to be shared across a number of python modules within my lambda function. This function will ultimately be run on AWS Greengrass on a local on-premise device.
For some reason, when I add my own handler to format the messages, the logs are output twice: once at the correct log level and a second time at an incorrect level.
If I use the standard Python logger without setting up any handlers, it works fine, e.g.:
main.py:
import logging
logging.debug("test1")
cloudwatch logs:
12:28:42 [DEBUG]-main.py:38,test1
My objective is to have one formatter in my code which will format these log messages into JSON. They will then be ingested into a centralized logging database. However, when I do this, I get the log messages twice.
loghelper.py:
import logging
import sys

def setup_logging(name):
    formatter = logging.Formatter("%(name)s, %(asctime)s, %(message)s")
    handler = logging.StreamHandler(sys.stdout)
    handler.setFormatter(formatter)
    logger = logging.getLogger(name)
    if logger.handlers:
        for handler in logger.handlers:
            logger.removeHandler(handler)
    logger.setLevel(logging.DEBUG)
    logger.addHandler(handler)
    return logger
main.py:
import logging
import loghelper

logger = loghelper.setup_logging('main.test_function')

def test_function():
    logger.debug("test function log statement")

test_function()
When the Lambda function is now run, I get the debug message twice in the CloudWatch logs, as follows:
cloudwatch logs:
12:22:53 [DEBUG]-main.py:5, test function log statement
12:22:53 [INFO]-__init__.py:880,main.test_function,2018-06-18 12:22:53,099, test function log statement
Notice that:
The first entry is at the correct level but in the wrong format.
The second entry reports the wrong level and the wrong module, but is in the correct format.
I cannot explain this behavior and would appreciate any thoughts on what could be causing it. I also don't know which constructor exists at line 880. This may shed some light on what is happening.
References:
Setting up a global formatter:
How to define a logger in python once for the whole program?
Clearing the default lambda log handlers:
Using python Logging with AWS Lambda
Creating a global logger:
Python: logging module - globally
AWS Lambda also sets up a handler, on the root logger, and anything written to stdout is captured and logged as level INFO. Your log message is thus captured twice:
By the AWS Lambda handler on the root logger (as log messages propagate from nested child loggers to the root), and this logger has its own format configured.
By the AWS Lambda stdout-to-INFO logger.
This is why the messages all start with (asctime) [(levelname)]-(module):(lineno) information; the root logger is configured to output messages in that format, and the information written to stdout is just another %(message)s part in that output.
Either don't set a handler when you are in the AWS environment, or disable propagation of the output to the root handler and live with all your messages being recorded as INFO messages by AWS; in the latter case your own formatter could include the levelname information in the output.
You can disable log propagation with logger.propagate = False, at which point your message is only going to be passed to your handler, not to the root handler as well.
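For illustration, a sketch of how that propagation option might look applied to the setup_logging() helper from the question (not verified on Lambda; your messages would then only reach CloudWatch as the INFO lines captured from stdout):

import logging
import sys

def setup_logging(name):
    formatter = logging.Formatter("%(name)s, %(asctime)s, %(message)s")
    handler = logging.StreamHandler(sys.stdout)
    handler.setFormatter(formatter)
    logger = logging.getLogger(name)
    logger.handlers = []              # drop any handlers attached earlier
    logger.setLevel(logging.DEBUG)
    logger.addHandler(handler)
    logger.propagate = False          # don't hand the record to the AWS root handler
    return logger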
Another option is to just rely on the AWS root logger configuration. According to this excellent reverse-engineering blog post, the root logger is configured with:
logging.Formatter.converter = time.gmtime
logger = logging.getLogger()
logger_handler = LambdaLoggerHandler()
logger_handler.setFormatter(logging.Formatter(
    '[%(levelname)s]\t%(asctime)s.%(msecs)dZ\t%(aws_request_id)s\t%(message)s\n',
    '%Y-%m-%dT%H:%M:%S'
))
logger_handler.addFilter(LambdaLoggerFilter())
logger.addHandler(logger_handler)
This replaces the time.localtime converter on logging.Formatter with time.gmtime (so timestamps use UTC rather than local time), sets a custom handler that makes sure messages go to the Lambda infrastructure, configures a format, and adds a filter object that only adds the aws_request_id attribute to records (so the above formatter can include it) but doesn't actually filter anything.
You could alter the formatter on that handler by updating the attributes on the handler.formatter object:
for handler in logging.getLogger().handlers:
    formatter = handler.formatter
    if formatter is not None and 'aws_request_id' in formatter._fmt:
        # this is the AWS Lambda formatter
        # formatter.datefmt => '%Y-%m-%dT%H:%M:%S'
        # formatter._style._fmt =>
        #     '[%(levelname)s]\t%(asctime)s.%(msecs)dZ'
        #     '\t%(aws_request_id)s\t%(message)s\n'
        pass  # update the formatter's attributes here if desired
and then just drop your own logger handler entirely. You do want to be careful with this; AWS Lambda infrastructure could well be counting on a specific format being used. The output you show in your question doesn't include the date component (the %Y-%m-%dT part of the formatter.datefmt string) which probably means that the format has been parsed out and is being presented to you in a web application view of the data.
I'm not sure whether this is the cause of your problem, but by default, Python's loggers propagate their messages up the logging hierarchy. As you probably know, Python loggers are organized in a tree, with the root logger at the top and other loggers below it. In logger names, a dot (.) introduces a new hierarchy level. So if you do
logger = logging.getLogger('some_module.some_function')
then you actually have 3 loggers:
The root logger (logging.getLogger())
A logger at module level (logging.getLogger('some_module'))
A logger at function level (logging.getLogger('some_module.some_function'))
If you emit a log message on a logger and it is not discarded based on the logger's minimum level, then the message is passed on to the logger's handlers and to its parent logger. See this flowchart for more information.
If that parent logger (or any logger higher up in the hierarchy) also has handlers, then they are called, too.
I suspect that in your case, either the root logger or the main logger somehow ends up with some handlers attached, which leads to the duplicate messages. To avoid that, you can set propagate in your logger to False or only attach your handlers to the root logger.
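A minimal stand-alone demonstration of that duplication, and of propagate = False as the fix (plain Python, outside Lambda):

import logging

logging.basicConfig(format="root handler: %(message)s")  # attaches a handler to the root logger

child = logging.getLogger("some_module.some_function")
child.addHandler(logging.StreamHandler())                # attach a second handler to the child

child.warning("hello")         # printed twice: once by each handler
child.propagate = False
child.warning("hello again")   # printed once: the root handler no longer sees it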

Multi-line logging in Python

I'm using Python 3.3.5 and the logging module to log information to a local file (from different threads). There are cases where I'd like to output some additional information, without knowing exactly what that information will be (e.g. it might be one single line of text or a dict).
What I'd like to do is add this additional information to my log file, after the log record has been written. Furthermore, the additional info is only necessary when the log level is error (or higher).
Ideally, it would look something like:
2014-04-08 12:24:01 - INFO - CPU load not exceeded
2014-04-08 12:24:26 - INFO - Service is running
2014-04-08 12:24:34 - ERROR - Could not find any active server processes
Additional information, might be several lines.
Dict structured information would be written as follows:
key1=value1
key2=value2
2014-04-08 12:25:16 - INFO - Database is responding
Short of writing a custom log formatter, I couldn't find much which would fit my requirements. I've read about filters and contexts, but again this doesn't seem like a good match.
Alternatively, I could just write to a file using the standard I/O, but most of the functionality already exists in the Logging module, and moreover it's thread-safe.
Any input would be greatly appreciated. If a custom log formatter is indeed necessary, any pointers on where to start would be fantastic.
Keeping in mind that many people consider multi-line logging messages bad practice (understandably so: log processors like DataDog or Splunk are well prepared to handle single-line logs, and multi-line logs are much harder to parse), you can play with the extra parameter and use a custom filter to append stuff to the message that is going to be shown (take a look at the usage of 'extra' in the logging package documentation).
import logging

class CustomFilter(logging.Filter):
    def filter(self, record):
        if hasattr(record, 'dct') and len(record.dct) > 0:
            for k, v in record.dct.items():
                record.msg = record.msg + '\n\t' + k + ': ' + v
        return super(CustomFilter, self).filter(record)

if __name__ == "__main__":
    logging.getLogger().setLevel(logging.DEBUG)
    extra_logger = logging.getLogger('extra_logger')
    extra_logger.setLevel(logging.INFO)
    extra_logger.addFilter(CustomFilter())
    logging.debug("Nothing special here... Keep walking")
    extra_logger.info("This shows extra",
                      extra={'dct': {"foo": "bar", "baz": "loren"}})
    extra_logger.debug("You shouldn't be seeing this in the output")
    extra_logger.setLevel(logging.DEBUG)
    extra_logger.debug("Now you should be seeing it!")
That code outputs:
DEBUG:root:Nothing special here... Keep walking
INFO:extra_logger:This shows extra
foo: bar
baz: loren
DEBUG:extra_logger:Now you should be seeing it!
I still recommend calling the parent's filter function in your custom filter, mainly because that's the function that decides whether the message is shown or not (for instance, if your logger's level is set to logging.INFO and you log something using extra_logger.debug, that message shouldn't be seen, as shown in the example above).
I just add \n symbols to the output text.
I'm using a simple line splitter in my smaller applications:
for line in logmessage.splitlines():
    writemessage = logtime + " - " + line + "\n"
    logging.info(str(writemessage))
Note that this is not thread-safe and should probably only be used in low-volume logging applications.
However you can output to log almost anything, as it will preserve your formatting. I have used it for example to output JSON API responses formatted using: json.dumps(parsed, indent=4, sort_keys=True)
It seems that I made a small typo when defining my LogFormatter string: by accidentally escaping the newline character, I wrongly assumed that writing multi-line output to a log file was not possible.
Cheers to @Barafu for pointing this out (which is why I accepted his answer).
Here's the sample code:
import logging
lf = logging.Formatter('%(levelname)-8s - %(message)s\n%(detail)s')
lh = logging.FileHandler(filename=r'c:\temp\test.log')
lh.setFormatter(lf)
log = logging.getLogger()
log.setLevel(logging.DEBUG)
log.addHandler(lh)
log.debug('test', extra={'detail': 'This is a multi-line\ncomment to test the formatter'})
The resulting output would look like this:
DEBUG - test
This is a multi-line
comment to test the formatter
Caveat:
If there is no detail information to log, and you pass an empty string, the logger will still output a newline. Thus, the remaining question is: how can we make this conditional?
One approach would be to update the logging formatter before actually logging the information, as described here.
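For example, one possible way to make it conditional (a sketch, not from the linked answer) is a small custom Formatter that only appends the detail block when it is present and non-empty:

import logging

class DetailFormatter(logging.Formatter):
    # Append the record's 'detail' attribute on extra lines only when it is non-empty.
    def format(self, record):
        base = super().format(record)
        detail = getattr(record, 'detail', '')
        return base + '\n' + str(detail) if detail else base

lh = logging.FileHandler(filename=r'c:\temp\test.log')
lh.setFormatter(DetailFormatter('%(levelname)-8s - %(message)s'))
log = logging.getLogger()
log.setLevel(logging.DEBUG)
log.addHandler(lh)

log.debug('no extra info here')  # logged as a single line
log.error('something broke', extra={'detail': 'key1=value1\nkey2=value2'})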

Setting up advanced Python logging

I'd like to use logging for my modules, but I'm not sure how to design the following requirements:
normal logging levels (info, error, warning, debug) but also some additional more verbose debug levels
logging messages can have different types; some are meant for the developer, some are meant for the user; those types go to different outputs
errors should go to stderr
I also need to keep track of which module/function/code line wrote a debug message so that I can activate or deactivate individual debug messages in a configuration
I need to keep track of whether errors occurred at all, to eventually execute a sys.exit() at the end of the program
all messages should go to stdout until the loggers are set up
I've read the logging documentation, but I'm not sure what the most streamlined way is to use the logging module with the requirements above (how to use the concepts of Logger, Handler, Filter, ...). Can you point out an idea to set this up? (e.g. write a module with two loggers 'user' and 'developer'; derive from Logger; do getLogger(__name__); keep an error flag like this, ... etc.)
1) Adding more verbose debug levels.
Have you thought this through?
Take a look at what the docs say:
Defining your own levels is possible, but should not be necessary, as the existing levels have been chosen on the basis of practical experience. However, if you are convinced that you need custom levels, great care should be exercised when doing this, and it is possibly a very bad idea to define custom levels if you are developing a library. That's because if multiple library authors all define their own custom levels, there is a chance that the logging output from such multiple libraries used together will be difficult for the using developer to control and/or interpret, because a given numeric value might mean different things for different libraries.
Also take a look at When to use logging; there are two very good tables explaining when to use what.
Anyway, if you think you'll need those extra logging levels, take a look at: logging.addLevelName().
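As an illustration, a rough sketch of adding one extra, more verbose level (the name TRACE, the numeric value 5 and the helper method are all arbitrary choices, not anything the logging module provides by itself):

import logging

TRACE = 5                                  # any unused value below DEBUG (10)
logging.addLevelName(TRACE, "TRACE")

def trace(self, msg, *args, **kwargs):
    # Convenience method so you can call logger.trace(...)
    if self.isEnabledFor(TRACE):
        self._log(TRACE, msg, args, **kwargs)

logging.Logger.trace = trace

logging.basicConfig(level=TRACE)
logging.getLogger(__name__).trace("very verbose details")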
2) Some logging messages for the developer, and some for the user
Use different logger families with different handlers. At the base of each family, set Logger.propagate to False.
3) Errors should go to stderr
This already happen by default with StreamHandler:
class logging.StreamHandler(stream=None)
Returns a new instance of the StreamHandler class. If stream is specified, the instance will use it for logging output; otherwise, sys.stderr will be used.
4) Keep track of the source of a log message
Get Loggers with different names, and in your Formatter use format strings with %(name)s.
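For example (the exact attributes are just one possible choice; %(module)s, %(funcName)s and %(lineno)d address the "which module/function/line wrote this" requirement):

import logging

formatter = logging.Formatter(
    "%(name)s %(module)s.%(funcName)s:%(lineno)d %(levelname)s: %(message)s")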
5) All messages should go to stdout until the loggers are set up
The setup of your logging system should be one of the very first things to do, so I don't really see what this means. If you need to send messages to stdout, use print, as already explained in When to use logging.
Last advice: carefully read the Logging Cookbook as it covers pretty well what you need.
From the comment: How would I design to have output to different sources and also filter my module?
I wouldn't filter in the first place; filters are hard to maintain, and if they are all in one place, that place will have to hold too much information. Every module should get and set its own Logger (with its own handlers or filters), using its parent's settings or not.
Very quick example:
# at the very beginning
root = logging.getLogger()
fallback_handler = logging.StreamHandler(stream=sys.stdout)
root.addHandler(fallback_handler)

# first.py
first_logger = logging.getLogger('first')
first_logger.propagate = False
# ... set 'first' logger as you wish

class Foo:
    def __init__(self):
        self.logger = logging.getLogger('first.Foo')

    def baz(self):
        self.logger.info("I'm in baz")

# second.py
second_logger = logging.getLogger('first.second')  # to use the same settings

# third.py
abstract_logger = logging.getLogger('abs')
abstract_logger.propagate = False
# ... set 'abs' logger
third_logger = logging.getLogger('abs.third')
# ... set 'abs.third' particular settings

# fourth.py
fourth_logger = logging.getLogger('abs.fourth')

# [...]
