I have a logging object that was initialised using logging.config.dictConfig(logging_conf_dict):
import logging.config

logging_conf_dict = {
    'version': 1,
    'disable_existing_loggers': False,
    'formatters': {
        'console': {
            'format': '%(asctime)s %(name)s %(levelname)s %(lineno)d %(message)s',
        },
    },
    'handlers': {
        'console': {
            'level': 'INFO',
            'class': 'logging.StreamHandler',
            'formatter': 'console',
        },
    },
    'loggers': {
        # root logger
        '': {
            'level': 'INFO',
            'handlers': ['console'],
        },
    },
}

logging.config.dictConfig(logging_conf_dict)
For testing purposes I am trying to do the operation in reverse: I want to get the configuration dict back from a configured logging object...
something like:
logging_conf_dict = logging.config.dump()
or
logging_conf_dict = get_configurations(logging)
Is there a way to access the configuration of the logging module as a dictionary?
Unfortunately, that information is not stored. As you can see in the source code, https://github.com/python/cpython/blob/3.9/Lib/logging/config.py, logging.config.dictConfig only stores the passed-in config in a temporary dict which gets thrown away at the end of the function (starting at line 805):
dictConfigClass = DictConfigurator
def dictConfig(config):
    """Configure logging using a dictionary."""
    dictConfigClass(config).configure()
and looking at the configure method, the dict is not stored there either.
I'm guessing the reason they do this is that dictConfig is acting on global variables (the default logger).
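If you control the call to dictConfig, one simple workaround (a sketch, not anything the logging module itself provides) is to keep a reference to the dict yourself, for example by wrapping dictConfig:

import logging.config

_last_dict_config = None  # hypothetical module-level cache, not part of the stdlib

def dict_config(config):
    """Configure logging and remember the dict that was used."""
    global _last_dict_config
    logging.config.dictConfig(config)
    _last_dict_config = config

def get_last_dict_config():
    """Return the dict last passed to dict_config(), or None if never configured."""
    return _last_dict_config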
Unfortunately the original dict configuration is not persisted during the load of the configuration within the logging module.
So your best bet might be creating your own custom parser which will retrieve available loggers and parse their handlers and formatters.
Something like this:
import logging

def parse_logging_dictconfig():
    dict_config = {"loggers": {}}
    level_name_map = {
        logging.DEBUG: "DEBUG",
        logging.INFO: "INFO",
        # etc...
    }
    loggers = [logging.getLogger(name) for name in logging.root.manager.loggerDict]
    for logger in loggers:
        logger_config = {}
        logger_config["level"] = level_name_map.get(logger.level)
        logger_config["handlers"] = []
        for handler in logger.handlers:
            logger_config["handlers"].append(handler.name)
        dict_config["loggers"][logger.name] = logger_config
    return dict_config
The example is not complete, but based on that you can get the basic idea.
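If you also need handler levels and format strings, you could extend the sketch along these lines (best-effort only: Formatter._fmt is a private attribute, so there is no supported way to read the format string back):

import logging

def parse_handler_config(handler):
    """Best-effort reconstruction of a handler's config (sketch only)."""
    config = {
        "class": type(handler).__module__ + "." + type(handler).__name__,
        "level": logging.getLevelName(handler.level),
    }
    if handler.formatter is not None:
        # _fmt is private; there is no public API to recover the format string
        config["format"] = getattr(handler.formatter, "_fmt", None)
    return config

You would call this from inside the loop over logger.handlers above and store the result per handler.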
I'm trying to write a highly modular Python logging system (using the logging module) and include information from the trace module in the log message.
For example, I want to be able to write a line of code like:
my_logger.log_message(MyLogFilter, "this is a message")
and have it include the trace of where the "log_message" call was made, instead of the actual logger call itself.
I almost have the following code working except for the fact that the trace information is from the logging.debug() call rather than the my_logger.log_message() one.
import logging

class MyLogFilter(logging.Filter):
    def __init__(self):
        self.extra = {"error_code": 999}
        self.level = "debug"

    def filter(self, record):
        for key in self.extra.keys():
            setattr(record, key, self.extra[key])
        return True  # keep the record after annotating it

class myLogger(object):
    def __init__(self):
        fid = logging.FileHandler("test.log")
        formatter = logging.Formatter('%(pathname)s:%(lineno)i, %(error_code)i, %(message)s')
        fid.setFormatter(formatter)
        self.my_logger = logging.getLogger(name="test")
        self.my_logger.setLevel(logging.DEBUG)
        self.my_logger.addHandler(fid)

    def log_message(self, lfilter, message):
        xfilter = lfilter()
        self.my_logger.addFilter(xfilter)
        log_funct = getattr(self.my_logger, xfilter.level)
        log_funct(message)

if __name__ == "__main__":
    logger = myLogger()
    logger.log_message(MyLogFilter, "debugging")
This is a lot of trouble to go through in order to make a simple logging.debug call, but in reality I will have many different versions of MyLogFilter at different logging levels, each containing a different value of the "error_code" attribute. I'm trying to make the log_message() call as short and sweet as possible because it will be repeated numerous times.
I would appreciate any information about how to do what I want to, or if I'm completely off on the wrong track and if that's the case, what I should be doing instead.
I would like to stick to the internal python modules of "logging" and "trace" if that's possible instead of using any external solutions.
"...or if I'm completely off on the wrong track and if that's the case, what I should be doing instead."
My strong suggestion is that you view logging as a solved problem and avoid reinventing the wheel.
If you need more than the standard library's logging module provides, you probably want something like structlog (pip install structlog).
Structlog will give you:
data binding
cloud native structured logging
pipelines
...and more
It will handle most local and cloud use cases.
Below is one common configuration that outputs colorized logging to stdout and plain logging to a .log file, and that can be extended further to log to e.g. AWS CloudWatch.
Notice there is an included processor: StackInfoRenderer -- this will include stack information in all logging calls with a 'truthy' value for stack_info (this also exists in stdlib's logging, by the way). If you only want stack info for exceptions, then you'd want to do something like exc_info=True for your logging calls.
main.py
from structlog import get_logger
from logging_config import configure_local_logging

configure_local_logging()

logger = get_logger()
logger.info("Some random info")
logger.debug("Debugging info with stack", stack_info=True)

try:
    assert 'foo' == 'bar'
except Exception as e:
    logger.error("Error info with an exc", exc_info=e)
logging_config.py
import logging
import logging.config

import structlog

def configure_local_logging(filename=__name__):
    """Provides a structlog colorized console and file renderer for logging in eg ING tickets"""
    timestamper = structlog.processors.TimeStamper(fmt="%Y-%m-%d %H:%M:%S")
    pre_chain = [
        structlog.stdlib.add_log_level,
        timestamper,
    ]

    logging.config.dictConfig({
        "version": 1,
        "disable_existing_loggers": False,
        "formatters": {
            "plain": {
                "()": structlog.stdlib.ProcessorFormatter,
                "processor": structlog.dev.ConsoleRenderer(colors=False),
                "foreign_pre_chain": pre_chain,
            },
            "colored": {
                "()": structlog.stdlib.ProcessorFormatter,
                "processor": structlog.dev.ConsoleRenderer(colors=True),
                "foreign_pre_chain": pre_chain,
            },
        },
        "handlers": {
            "default": {
                "level": "DEBUG",
                "class": "logging.StreamHandler",
                "formatter": "colored",
            },
            "file": {
                "level": "DEBUG",
                "class": "logging.handlers.WatchedFileHandler",
                "filename": filename + ".log",
                "formatter": "plain",
            },
        },
        "loggers": {
            "": {
                "handlers": ["default", "file"],
                "level": "DEBUG",
                "propagate": True,
            },
        },
    })

    structlog.configure_once(
        processors=[
            structlog.stdlib.add_log_level,
            structlog.stdlib.PositionalArgumentsFormatter(),
            timestamper,
            structlog.processors.StackInfoRenderer(),
            structlog.processors.format_exc_info,
            structlog.stdlib.ProcessorFormatter.wrap_for_formatter,
        ],
        context_class=dict,
        logger_factory=structlog.stdlib.LoggerFactory(),
        wrapper_class=structlog.stdlib.BoundLogger,
        cache_logger_on_first_use=True,
    )
Structlog can do quite a bit more than this. I suggest you check it out.
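As one example of the data binding mentioned above (a minimal sketch assuming configure_local_logging() has already run; the field names are invented):

from structlog import get_logger

log = get_logger()

# bind() returns a new logger with these key/value pairs attached to every event
request_log = log.bind(request_id="abc123", user="alice")
request_log.info("request received")   # both events carry request_id and user
request_log.info("request handled")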
It turns out the missing piece to the puzzle is using the "traceback" module rather than the "trace" one. It's simple enough to parse the output of traceback to pull out the source filename and line number of the ".log_message()" call.
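For reference, a minimal sketch of that idea (the helper name and the stack offset are mine, not from the original code):

import traceback

def caller_location():
    """Return (filename, lineno) of the code that called the calling function."""
    # [-1] is this frame, [-2] is log_message(), [-3] is the code that called it
    frame = traceback.extract_stack()[-3]
    return frame.filename, frame.lineno

Inside log_message() you could prepend the returned location to the message before handing it to the underlying logger. As an aside, on Python 3.8+ the logging methods also accept a stacklevel argument, which addresses the same problem more directly.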
If my logging needs become any more complicated then I'll definitely look into structlog. Thank you for that information, as I'd never heard about it before.
I have seen a python dict log config in uvicorn's source code.
In it, they have defined formatters as:
{
    "default": {
        "()": "uvicorn.logging.DefaultFormatter",
        "fmt": "%(levelprefix)s %(asctime)s %(message)s",
        "datefmt": "%Y-%m-%d %H:%M:%S",
    },
    "access": {
        "()": "uvicorn.logging.AccessFormatter",
        "fmt": '%(levelprefix)s %(asctime)s :: %(client_addr)s - "%(request_line)s" %(status_code)s',
        "use_colors": True,
    },
}
Also, we can see they defined a logger with an empty name (not sure what I should call it) as:
"": {"handlers": ["default"], "level": "INFO"},
^^^^ - see, Empty key
So, here are my questions:
What does the "()" do in the formatters section of the Python logging config?
What does the "" do in the loggers section of the Python logging config?
This dictionary is used to configure logging with logging.config.dictConfig().
The "()" key indicates that custom instantiation is required [source]:
In all cases below where a ‘configuring dict’ is mentioned, it will be checked for the special '()' key to see if a custom instantiation is required. If so, the mechanism described in User-defined objects below is used to create an instance; otherwise, the context is used to determine what to instantiate.
In the case of the formatter config in the OP's question, the "()" indicates that those classes should be used to instantiate a Formatter.
I do not see the empty string in the loggers section of the dictionary, but here are the related docs:
loggers - the corresponding value will be a dict in which each key is a logger name and each value is a dict describing how to configure the corresponding Logger instance.
The configuring dict is searched for the following keys:
level (optional). The level of the logger.
propagate (optional). The propagation setting of the logger.
filters (optional). A list of ids of the filters for this logger.
handlers (optional). A list of ids of the handlers for this logger.
The specified loggers will be configured according to the level, propagation, filters and handlers specified.
So a "" key in the loggers dictionary would instantiate a logger with the name "", like logging.getLogger("").
One might use a custom logging formatter for a variety of reasons. uvicorn uses a custom formatter to log different levels in different colors. The Python Logging Cookbook has an example of using a custom formatter to use UTC times instead of local times in logging messages.
import logging
import logging.config
import time

class UTCFormatter(logging.Formatter):
    converter = time.gmtime

LOGGING = {
    ...
    'formatters': {
        'utc': {
            '()': UTCFormatter,
            'format': '%(asctime)s %(message)s',
        },
        'local': {
            'format': '%(asctime)s %(message)s',
        },
    },
    ...
}

if __name__ == '__main__':
    logging.config.dictConfig(LOGGING)
    logging.warning('The local time is %s', time.asctime())
Here is the output. Note that in the first line, UTC time is used instead of local time, because the UTCFormatter is used.
2015-10-17 12:53:29,501 The local time is Sat Oct 17 13:53:29 2015
2015-10-17 13:53:29,501 The local time is Sat Oct 17 13:53:29 2015
I have a requirement to log the Apache Airflow logs to stdout in JSON format. Airflow does not seem to provide this capability out of the box. I have found a couple of Python modules that are capable of this task, but I cannot get the implementation to work.
Currently, I am applying a class in airflow/utils/logging.py to modify the logger, shown below:
import datetime

from pythonjsonlogger import jsonlogger

class StackdriverJsonFormatter(jsonlogger.JsonFormatter, object):
    def __init__(self, fmt="%(levelname) %(asctime) %(nanotime) %(severity) %(message)", style='%', *args, **kwargs):
        jsonlogger.JsonFormatter.__init__(self, fmt=fmt, *args, **kwargs)

    def process_log_record(self, log_record):
        if log_record.get('level'):
            log_record['severity'] = log_record['level']
            del log_record['level']
        else:
            log_record['severity'] = log_record['levelname']
            del log_record['levelname']
        if log_record.get('asctime'):
            log_record['timestamp'] = log_record['asctime']
            del log_record['asctime']
        now = datetime.datetime.now().strftime('%Y-%m-%dT%H:%M:%S.%fZ')
        log_record['nanotime'] = now
        return super(StackdriverJsonFormatter, self).process_log_record(log_record)
I am implementing this code in /airflow/settings.py as shown below:
import sys

from airflow.utils import logging as logconf

def configure_logging(log_format=LOG_FORMAT):
    handler = logconf.logging.StreamHandler(sys.stdout)
    formatter = logconf.StackdriverJsonFormatter()
    handler.setFormatter(formatter)
    logging = logconf.logging.getLogger()
    logging.addHandler(handler)
    ''' code below was original airflow source code
    logging.root.handlers = []
    logging.basicConfig(
        format=log_format, stream=sys.stdout, level=LOGGING_LEVEL)
    '''
I have tried a couple different variations of this and can't get the python-json-logger to transform the logs to JSON. Perhaps I'm not getting to the root logger? Another option I have considered is manually formatting the logs to a JSON string. No luck with that yet either. Any alternative ideas, tips, or support are appreciated.
Cheers!
I don't know if you ever solved this problem, but after some frustrating tinkering, I ended up getting this to play nice with airflow. For reference, I followed a lot of this article to get it working: https://www.astronomer.io/guides/logging/. The main issue was that the airflow logging only accepts a string template for the logging format, which json-logging can't plug into. So you have to create your own logging classes and connect it to a custom logging config class.
Copy the log template here into your $AIRFLOW_HOME/config folder, and change DEFAULT_CONFIG_LOGGING to CONFIG_LOGGING. When you're successful, bring up airflow and you'll get a log message on airflow startup that says Successfully imported user-defined logging config from logging_config.LOGGING_CONFIG. If this is the first .py file in the config folder, don't forget to add a blank __init__.py file so Python picks it up.
Write your custom JsonFormatter to inject into your handler. I did mine off of this one.
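For reference, a minimal version of such a formatter might look like this (a sketch based on the python-json-logger docs; the only requirement is that the class name matches what the handlers in the next step expect):

import datetime

from pythonjsonlogger import jsonlogger

class CustomJsonFormatter(jsonlogger.JsonFormatter):
    def add_fields(self, log_record, record, message_dict):
        super(CustomJsonFormatter, self).add_fields(log_record, record, message_dict)
        # Fill in the fields referenced in the '(timestamp) (level) (name) (message)' format string
        if not log_record.get('timestamp'):
            log_record['timestamp'] = datetime.datetime.utcnow().isoformat()
        log_record['level'] = record.levelname
        log_record['name'] = record.name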
Write the custom log handler classes. Since I was looking for JSON logging, mine look like this:
from logging.handlers import RotatingFileHandler

from airflow.utils.log.file_processor_handler import FileProcessorHandler
from airflow.utils.log.file_task_handler import FileTaskHandler
from airflow.utils.log.logging_mixin import RedirectStdHandler
from pythonjsonlogger import jsonlogger

# CustomJsonFormatter is the formatter written in the previous step

class JsonStreamHandler(RedirectStdHandler):
    def __init__(self, stream):
        super(JsonStreamHandler, self).__init__(stream)
        json_formatter = CustomJsonFormatter('(timestamp) (level) (name) (message)')
        self.setFormatter(json_formatter)

class JsonFileTaskHandler(FileTaskHandler):
    def __init__(self, base_log_folder, filename_template):
        super(JsonFileTaskHandler, self).__init__(base_log_folder, filename_template)
        json_formatter = CustomJsonFormatter('(timestamp) (level) (name) (message)')
        self.setFormatter(json_formatter)

class JsonFileProcessorHandler(FileProcessorHandler):
    def __init__(self, base_log_folder, filename_template):
        super(JsonFileProcessorHandler, self).__init__(base_log_folder, filename_template)
        json_formatter = CustomJsonFormatter('(timestamp) (level) (name) (message)')
        self.setFormatter(json_formatter)

class JsonRotatingFileHandler(RotatingFileHandler):
    def __init__(self, filename, mode, maxBytes, backupCount):
        super(JsonRotatingFileHandler, self).__init__(filename, mode, maxBytes, backupCount)
        json_formatter = CustomJsonFormatter('(timestamp) (level) (name) (message)')
        self.setFormatter(json_formatter)
Hook them up to the logging configs in your custom logging_config.py file.
'handlers': {
    'console': {
        'class': 'logging_handler.JsonStreamHandler',
        'stream': 'sys.stdout'
    },
    'task': {
        'class': 'logging_handler.JsonFileTaskHandler',
        'base_log_folder': os.path.expanduser(BASE_LOG_FOLDER),
        'filename_template': FILENAME_TEMPLATE,
    },
    'processor': {
        'class': 'logging_handler.JsonFileProcessorHandler',
        'base_log_folder': os.path.expanduser(PROCESSOR_LOG_FOLDER),
        'filename_template': PROCESSOR_FILENAME_TEMPLATE,
    }
}
...
and
DEFAULT_DAG_PARSING_LOGGING_CONFIG = {
    'handlers': {
        'processor_manager': {
            'class': 'logging_handler.JsonRotatingFileHandler',
            'formatter': 'airflow',
            'filename': DAG_PROCESSOR_MANAGER_LOG_LOCATION,
            'mode': 'a',
            'maxBytes': 104857600,  # 100MB
            'backupCount': 5
        }
    }
...
And JSON logs should be emitted, both in the DAG logs and in the console output.
Hope this helps!
So I have two loggers in my Django project. One for authentication failures, and one which includes those but also includes messages for when something is edited (basically everything for which I have a logger command).
I seem to have a bit of a problem in modules where I want to use both loggers however. My two loggers are currently defined like so:
'': {
    'handlers': ['file'],
    'level': 'INFO',
    'propagate': True,
},
'auth': {
    'handlers': ['file_auth'],
    'level': 'CRITICAL',
    'propagate': True,
}
And my handlers are:
'file': {
    'level': 'INFO',
    'class': 'logging.FileHandler',
    'filename': '/home/debug.log',
    'formatter': 'simple',
},
'file_auth': {
    'level': 'CRITICAL',
    'class': 'logging.FileHandler',
    'filename': '/home/debug2.log',
    'formatter': 'verbose',
},
At the top of my Django view, I have
import logging
logger = logging.getLogger('')
logger = logging.getLogger('auth')
Then within one view I have a logger.info(message).
If I change this to logger.critical(message), the message appears in both log files but if it remains as logger.info, nothing happens at all.
(Probably useless information...at the start of my LOGGING section in settings.py, I have:
'version': 1,
'disable_existing_loggers': False,
Not sure if they have any relevance. But previously I was struggling to get the errors to appear in both files until I switched the order in which I introduced them, which magically changed things - I really don't understand why that would make a difference though.)
I would be really grateful if someone could help me out... It's probably really simple, but I must admit I don't really understand how it works.
You have two different loggers, named '' (the root logger) and 'auth'. The order in which your two statements appear:
logger = logging.getLogger('')
logger = logging.getLogger('auth')
obviously makes a difference when you call
logger.info(...)
as in the two cases you will be calling the method on two different loggers. You might wish to change your code to
root_logger = logging.getLogger('')
logger = logging.getLogger('auth')
and then call methods on either root_logger or logger as appropriate.
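To make the distinction concrete, here is roughly how the two loggers behave with the config above (a sketch; the messages are invented):

import logging

root_logger = logging.getLogger('')      # handlers: ['file'], level INFO
auth_logger = logging.getLogger('auth')  # handlers: ['file_auth'], level CRITICAL

root_logger.info("record edited")        # written to debug.log only
auth_logger.info("login attempt")        # dropped: below the 'auth' logger's CRITICAL level
auth_logger.critical("login failed")     # written to debug2.log, and to debug.log via propagation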
I have an issue with Django logging. By the way, answers to this question will also help me understand how namespaces work in a Django project.
Here is the structure of my project:
MyProject:
    -App1:
        ....
        views.py
    -App2:
        ....
    urls.py
    settings.py
I'd like to log all messages in one file, so I've set up the following logger in settings.py:
LOGGING = {
    'version': 1,
    'disable_existing_loggers': True,
    'formatters': {
        'verbose': {
            #'format': '%(levelname)-8s %(remote_addr)-15s %(path_info)s %(asctime)s %(name)-20s %(funcName)-15s %(message)s'
            'format': '%(levelname)-8s %(asctime)s %(name)-20s %(funcName)-15s %(message)s'
        },
    },
    'handlers': {
        'normal': {
            'level': 'DEBUG',
            'class': 'logging.FileHandler',
            'formatter': 'verbose',
            'filename': os.path.join('C:/dev/Instantaneus/Instantaneus/html/static', 'log', 'normal.log')
        },
        'console': {
            'level': 'DEBUG',
            'class': 'logging.StreamHandler',
            'formatter': 'verbose',
        },
    },
    'loggers': {
        'MyProject': {
            'handlers': ['normal', 'console'],
            'level': 'DEBUG',
            #'filters': ['request'],
            'propagate': True,
        },
    }
}
In urls.py:
from MyProject.App1.views import EvenementDetailView,
....
url(r'App1/(?P<pk>\d+)/$', login_required(EvenementDetailView.as_view(model=Evenement)), name='evenement_details'),
and in App1/views.py:
from django.views.generic import DetailView
import logging

logger = logging.getLogger(__name__)
....

class EvenementDetailView(DetailView):
    print __name__
    model = Evenement
    ....
    logger.debug('blabla')
In the browser, when I call http://localhost/App1/3, the following appears in the console:
DEBUG 2012-01-18 14:59:04,503 Myproject.evenements.views EvenementDetailView blabla
MyProject.evenements.views
evenements.views
So my question is: why is the print __name__ code executed twice, and more importantly, why are the outputs not the same?
I suppose the DEBUG log line appears only once because evenements.views can't propagate to MyProject, since in that case the root package is evenements.
Any ideas?
Solution, without an in-depth explanation, but it works:
In my urls.py, I had a line url(r'App1/(?P<pk>\d+)/activate/$', 'app1.views.activate') where 'activate' is a function in App1/views.py. I changed 'App1.views.activate' to 'MyProject.app1.views.activate' and it works fine. I now get only one line in the console for print __name__. I think I get only one line because of 'disable_existing_loggers': True, but what I can't explain is that this change made my views.py get parsed only once instead of twice. To verify this, I added a print "blabla" at the beginning of the file: in the first case it's printed twice, and in the second case only once.
Fair warning: I'm not sure this is correct, and I'm basing some of this on stuff I remember reading a long time ago, but can't find in google now.
What you're seeing is a side-effect of how the Python import mechanism works. When a module is imported, it's put into sys.modules. However, when it's possible to import the module under two different dotted paths (in this case, with or without MyProject), it can be imported twice, once under each __name__.
The underlying fix is to make sure MyProject is not on sys.path - the directory containing MyProject should be, but not MyProject itself. You can verify this by starting a manage.py shell and making sure import evenements fails. There are some Django internals in manage.py that could make this difficult - but the last time I ran into this was back around 1.0 or 1.1, so it may well have been fixed.
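One quick way to check for the double import from a manage.py shell (a sketch; the module names are taken from the question):

import sys

# If the same source file shows up under two different dotted names,
# the package is importable both with and without the MyProject prefix.
for name, module in sorted(sys.modules.items()):
    if name.endswith('views') and module is not None:
        print(name, getattr(module, '__file__', None))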
There's an in-depth discussion here:
http://blog.dscpl.com.au/2010/03/improved-wsgi-script-for-use-with.html
It's a long article, search for "two different names".