Customize key value in python structured (json) logging from config file

I have to output my python job's logs in structured (json) format for our downstream datadog agent to pick them up. Crucially, I have requirements about how specific log fields are named; e.g. there must be a timestamp field, and it cannot be called, say, asctime. So a desired log looks like:
{"timestamp": "2022-11-10 00:28:58,557", "name": "__main__", "level": "INFO", "message": "my message"}
I can get something very close to that with the following code:
import logging.config
from pythonjsonlogger import jsonlogger
logging.config.fileConfig("logging_config.ini", disable_existing_loggers=False)
logger = logging.getLogger(__name__)
logger.info("my message")
referencing the following logging_config.ini file:
[loggers]
keys = root
[handlers]
keys = consoleHandler
[formatters]
keys=json
[logger_root]
level=DEBUG
handlers=consoleHandler
[handler_consoleHandler]
class=StreamHandler
level=DEBUG
formatter=json
[formatter_json]
class = pythonjsonlogger.jsonlogger.JsonFormatter
format=%(asctime)s %(name)s - %(levelname)s:%(message)s
...however, this doesn't allow any flexibility in the keys of the output log JSON objects; e.g. my timestamp key is called "asctime", as below:
{"asctime": "2022-11-10 00:28:58,557", "name": "__main__", "levelname": "INFO", "message": "my message"}
I still want that asctime value (e.g. 2022-11-10 00:28:58,557), but need it to be referenced by a key called "timestamp" instead of "asctime". If at all possible I would strongly prefer a solution that adapts the logging_config.ini file (or potentially a yaml logging config file) with relatively minimal extra python code.
I also tried an alternative python json logging library that I thought provided very simple and elegant code, but unfortunately when I tried to use it, my log statement didn't get output at all...

You'll need a small, minimal amount of Python code, something like:
# in mymodule.py, say
import datetime

from pythonjsonlogger import jsonlogger

class CustomJsonFormatter(jsonlogger.JsonFormatter):
    def add_fields(self, log_record, record, message_dict):
        super().add_fields(log_record, record, message_dict)
        # rebuild the asctime-style value under the key "timestamp",
        # zero-padding the milliseconds to three digits
        log_record['timestamp'] = datetime.datetime.fromtimestamp(record.created).strftime('%Y-%m-%d %H:%M:%S') + f',{int(record.msecs):03d}'
and then change the configuration to reference it:
[formatter_json]
class = mymodule.CustomJsonFormatter
format=%(timestamp)s %(name)s - %(levelname)s:%(message)s
which would then output e.g.
{"timestamp": "2022-11-10 11:37:25,153", "name": "root", "levelname": "DEBUG", "message": "foo"}

Related

How to add a timestamp and loglevel to each log in Python's structlog?

How to configure structlog so it automatically adds loglevel and a timestamp (and maybe other fields) by default to each log message it logs? So I do not have to add it to every message explicitly.
I am displaying my messages as JSON (for further processing with Fluentd, Elasticsearch and Kibana). loglevel is not (for some reason) included in the output JSON log.
This is how I configure my structlog:
structlog.configure(
    processors=[structlog.processors.JSONRenderer()],
    wrapper_class=structlog.make_filtering_bound_logger(logging.INFO),
)
I am logging:
log.info("Artist saved", spotify_id=id)
Logs I am seeing (note: no time and no loglevel):
{"logger": "get_artists.py", "spotify_id": "4Y6z2aIww27vnxZz9xfG3S", "event": "Artist saved"}
I found my answer here: Python add extra fields to structlog-based formatters within logging
There are processors that do exactly what I needed:
structlog.configure(
    processors=[
        structlog.processors.add_log_level,
        structlog.processors.TimeStamper(fmt="iso", key="ts"),
        structlog.processors.JSONRenderer(),
    ],
    wrapper_class=structlog.make_filtering_bound_logger(logging.INFO),
)
Adding both add_log_level and TimeStamper resulted, as expected, in the extra fields in the log: ..., "level": "info", "ts": "2022-04-17T19:21:56.426093Z"}.
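Putting it together, a minimal end-to-end sketch of that configuration, using the log call from the question:

import logging

import structlog

structlog.configure(
    processors=[
        structlog.processors.add_log_level,
        structlog.processors.TimeStamper(fmt="iso", key="ts"),
        structlog.processors.JSONRenderer(),
    ],
    wrapper_class=structlog.make_filtering_bound_logger(logging.INFO),
)

log = structlog.get_logger()
log.info("Artist saved", spotify_id="4Y6z2aIww27vnxZz9xfG3S")
# {"spotify_id": "4Y6z2aIww27vnxZz9xfG3S", "event": "Artist saved", "level": "info", "ts": "..."}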

How to implement json format logs in python

I have the below piece of code for python logging and want to convert the logs into JSON format for better accessibility of the information. How can I convert them into JSON format?
import os
import logging

log_fmt = "%(asctime)s %(levelname)s %(message)s"
logging.basicConfig(format=log_fmt)  # wire log_fmt into the root handler
logger = logging.getLogger()
logger.setLevel(os.environ.get('LOG_LEVEL', 'INFO'))
logger.info("this is a test")
And the output looks like "2022-04-20 17:40:31,332 INFO this is a test"
How can I format this into a json object so I can access by keys?
Desired output:
{
"time": "2022-04-20 17:40:31,332",
"level": "INFO",
"message": "this is a test"
}
You could use the Python JSON Logger.
But if you don't want to, or can't do that, then your log format string should be...
log_fmt = ("{\"time\": %(asctime)-s, \"level\": %(levelname)-s, \"message\": %(message)s},")
You'll end up with an extra comma at the end of the log file that you can programmatically remove later. Or, you can do this if you want the comma at the top of the file...
log_fmt = (",{\"time\": %(asctime)-s, \"level\": %(levelname)-s, \"message\": %(message)s}")
But the json will look better in an editor with the comma at the end of every line.
If you provide a mechanism for users to download, or otherwise access log files, then you can do the trailing comma cleanup there, before you send the log file to the user.
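A minimal sketch of the cleanup step described above (the helper name is mine), assuming the trailing-comma variant:

import json

def read_log_as_json(path):
    # drop trailing whitespace and the final comma, then parse the
    # comma-terminated records as one JSON array
    with open(path) as f:
        body = f.read().rstrip().rstrip(",")
    return json.loads(f"[{body}]")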

Combining Python trace information and logging

I'm trying to write a highly modular Python logging system (using the logging module) and include information from the trace module in the log message.
For example, I want to be able to write a line of code like:
my_logger.log_message(MyLogFilter, "this is a message")
and have it include the trace of where the "log_message" call was made, instead of the actual logger call itself.
I almost have the following code working except for the fact that the trace information is from the logging.debug() call rather than the my_logger.log_message() one.
import logging

class MyLogFilter(logging.Filter):
    def __init__(self):
        super().__init__()
        self.extra = {"error_code": 999}
        self.level = "debug"

    def filter(self, record):
        # stamp the extra attributes onto the record so the formatter can use them
        for key in self.extra:
            setattr(record, key, self.extra[key])
        return True

class myLogger(object):
    def __init__(self):
        fid = logging.FileHandler("test.log")
        formatter = logging.Formatter('%(pathname)s:%(lineno)i, %(error_code)i, %(message)s')
        fid.setFormatter(formatter)
        self.my_logger = logging.getLogger(name="test")
        self.my_logger.setLevel(logging.DEBUG)
        self.my_logger.addHandler(fid)

    def log_message(self, lfilter, message):
        xfilter = lfilter()
        self.my_logger.addFilter(xfilter)
        log_funct = getattr(self.my_logger, xfilter.level)
        log_funct(message)

if __name__ == "__main__":
    logger = myLogger()
    logger.log_message(MyLogFilter, "debugging")
This is a lot of trouble to go through just to make a simple logging.debug call, but in reality I will have a list of many different versions of MyLogFilter at different logging levels, each containing a different value of the "error_code" attribute, and I'm trying to make the log_message() call as short and sweet as possible because it will be repeated numerous times.
I would appreciate any information about how to do what I want to, or if I'm completely off on the wrong track and if that's the case, what I should be doing instead.
I would like to stick to the internal python modules of "logging" and "trace" if that's possible instead of using any external solutions.
"or if I'm completely off on the wrong track and if that's the case, what I should be doing instead."
My strong suggestion is that you view logging as a solved problem and avoid reinventing the wheel.
If you need more than the standard library's logging module provides, you probably want something like structlog (pip install structlog).
Structlog will give you:
data binding
cloud native structured logging
pipelines
...and more
It will handle most local and cloud use cases.
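For example, the data binding mentioned above lets you attach context once and have it appear on every subsequent message from that logger; a tiny sketch (the request_id field is a made-up example):

import structlog

log = structlog.get_logger().bind(request_id="abc123")  # bind context once
log.info("handling request")   # request_id appears on this line...
log.info("request complete")   # ...and on this one, without repeating it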
Below is one common configuration that outputs colorized logging to stdout and plain logging to a .log file, and can be extended further to log to e.g. AWS CloudWatch.
Notice the included StackInfoRenderer processor: it adds stack information to all logging calls that pass a 'truthy' value for stack_info (this also exists in stdlib's logging, by the way). If you only want stack info for exceptions, then you'd use something like exc_info=True in your logging calls.
main.py
from structlog import get_logger
from logging_config import configure_local_logging

configure_local_logging()

logger = get_logger()
logger.info("Some random info")
logger.debug("Debugging info with stack", stack_info=True)

try:
    assert 'foo' == 'bar'
except Exception as e:
    logger.error("Error info with an exc", exc_info=e)
logging_config.py
import logging.config

import structlog

def configure_local_logging(filename=__name__):
    """Provides a structlog colorized console and file renderer for logging in eg ING tickets"""
    timestamper = structlog.processors.TimeStamper(fmt="%Y-%m-%d %H:%M:%S")
    pre_chain = [
        structlog.stdlib.add_log_level,
        timestamper,
    ]

    logging.config.dictConfig({
        "version": 1,
        "disable_existing_loggers": False,
        "formatters": {
            "plain": {
                "()": structlog.stdlib.ProcessorFormatter,
                "processor": structlog.dev.ConsoleRenderer(colors=False),
                "foreign_pre_chain": pre_chain,
            },
            "colored": {
                "()": structlog.stdlib.ProcessorFormatter,
                "processor": structlog.dev.ConsoleRenderer(colors=True),
                "foreign_pre_chain": pre_chain,
            },
        },
        "handlers": {
            "default": {
                "level": "DEBUG",
                "class": "logging.StreamHandler",
                "formatter": "colored",
            },
            "file": {
                "level": "DEBUG",
                "class": "logging.handlers.WatchedFileHandler",
                "filename": filename + ".log",
                "formatter": "plain",
            },
        },
        "loggers": {
            "": {
                "handlers": ["default", "file"],
                "level": "DEBUG",
                "propagate": True,
            },
        }
    })

    structlog.configure_once(
        processors=[
            structlog.stdlib.add_log_level,
            structlog.stdlib.PositionalArgumentsFormatter(),
            timestamper,
            structlog.processors.StackInfoRenderer(),
            structlog.processors.format_exc_info,
            structlog.stdlib.ProcessorFormatter.wrap_for_formatter,
        ],
        context_class=dict,
        logger_factory=structlog.stdlib.LoggerFactory(),
        wrapper_class=structlog.stdlib.BoundLogger,
        cache_logger_on_first_use=True,
    )
Structlog can do quite a bit more than this. I suggest you check it out.
It turns out the missing piece of the puzzle is using the "traceback" module rather than the "trace" one. It's simple enough to parse the output of traceback to pull out the source filename and line number of the ".log_message()" call.
If my logging needs become any more complicated then I'll definitely look into structlog. Thank you for that information, as I'd never heard of it before.
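A minimal sketch of the traceback-based approach the asker describes (the helper name is mine): walk the extracted stack to find the log_message() frame, then report the frame just above it as the real call site.

import traceback

def find_call_site(funcname="log_message"):
    stack = traceback.extract_stack()
    for i, frame in enumerate(stack):
        if frame.name == funcname:
            # the frame before funcname's own frame is its caller
            caller = stack[i - 1]
            return caller.filename, caller.lineno
    return None, None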

What does "()" do in python log config

I have seen a python dict log config in uvicorn's source code.
In that, they have defined formatters as
{
    "default": {
        "()": "uvicorn.logging.DefaultFormatter",
        "fmt": "%(levelprefix)s %(asctime)s %(message)s",
        "datefmt": "%Y-%m-%d %H:%M:%S",
    },
    "access": {
        "()": "uvicorn.logging.AccessFormatter",
        "fmt": '%(levelprefix)s %(asctime)s :: %(client_addr)s - "%(request_line)s" %(status_code)s',
        "use_colors": True,
    },
}
Also, we can see they defined a logger with an empty name (not sure what I should call it):
"": {"handlers": ["default"], "level": "INFO"},
^^ - note the empty key
So, here are my questions:
What does the "()" do in formatters section of python logger?
What does the "" do in loggers section python logger?
This dictionary is used to configure logging with logging.config.dictConfig().
The "()" key indicates that custom instantiation is required [source]:
In all cases below where a ‘configuring dict’ is mentioned, it will be checked for the special '()' key to see if a custom instantiation is required. If so, the mechanism described in User-defined objects below is used to create an instance; otherwise, the context is used to determine what to instantiate.
In the case of the formatter config in the OP's question, the "()" indicates that those classes should be used to instantiate a Formatter.
I do not see the empty string in the loggers section of the dictionary, but here are the related docs:
loggers - the corresponding value will be a dict in which each key is a logger name and each value is a dict describing how to configure the corresponding Logger instance.
The configuring dict is searched for the following keys:
level (optional). The level of the logger.
propagate (optional). The propagation setting of the logger.
filters (optional). A list of ids of the filters for this logger.
handlers (optional). A list of ids of the handlers for this logger.
The specified loggers will be configured according to the level, propagation, filters and handlers specified.
So a "" key in the loggers dictionary would instantiate a logger with the name "", like logging.getLogger("").
One might use a custom logging formatter for a variety of reasons. uvicorn uses a custom formatter to log different levels in different colors. The Python Logging Cookbook has an example of using a custom formatter to use UTC times instead of local times in logging messages.
import logging
import logging.config
import time

class UTCFormatter(logging.Formatter):
    converter = time.gmtime

LOGGING = {
    ...
    'formatters': {
        'utc': {
            '()': UTCFormatter,
            'format': '%(asctime)s %(message)s',
        },
        'local': {
            'format': '%(asctime)s %(message)s',
        }
    },
    ...
}

if __name__ == '__main__':
    logging.config.dictConfig(LOGGING)
    logging.warning('The local time is %s', time.asctime())
Here is the output. Note that in the first line, UTC time is used instead of local time, because the UTCFormatter is used.
2015-10-17 12:53:29,501 The local time is Sat Oct 17 13:53:29 2015
2015-10-17 13:53:29,501 The local time is Sat Oct 17 13:53:29 2015

Configuring python logging with a config file

Please bear with me while I try to explain my setup.
I have an application structure as follows:
1.) MAIN AGENT
2.) SUPPORTING MODULES
3.) Various CLASSES called upon by MODULES
This is the script that currently sets up my logging.
logging_setup.py -
Through this script I set up a custom format using context filters and other classes. A snippet of it is as follows.
import logging
import threading

class ContextFilter(logging.Filter):
    CMDID_cf = "IAMTEST1"

    def __init__(self, CMDID1):
        super().__init__()
        self.CMDID_cf = CMDID1

    def filter(self, record):
        record.CMDID = self.CMDID_cf
        return True

class testFormatter(logging.Formatter):
    def format(self, record):
        record.message = record.getMessage()
        if self._fmt.find("%(asctime)") >= 0:
            record.asctime = self.formatTime(record, self.datefmt)
        # cmdidDict maps thread names to command IDs (defined elsewhere)
        if threading.currentThread().getName() in cmdidDict:
            record.CMDID = cmdidDict[threading.currentThread().getName()]
        else:
            record.CMDID = "Oda_EnvId"
        return self._fmt % record.__dict__

def initLogging(loggername):
    format = testFormatter(*some format*)
    *other configuration settings*
So I basically have two questions, both of which I believe can be solved, but I don't know how to implement them correctly.
1.) I want my MAIN AGENT logger to use the format and other configuration set up by the logging_setup script, while I want the MODULES to log messages with configuration set from a different config file.
So, in short: is it possible for two modules that log to the same file to have different configurations set from two different sources?
P.S. I am using a logging.getLogger() call to get the logger in each of these modules.
2.) If the above isn't possible (or even if it is), how can I make the config file include complex formatting? i.e. how can I change the following config file so that it sets up the format the same way logging_setup.py does?
My current config file is :
[loggers]
keys=root,testAgent,testModule
[formatters]
keys=generic
[handlers]
keys=fh
[logger_root]
level=DEBUG
handlers=fh
[logger_testAgent]
level=DEBUG
handlers=fh
qualname=testAgent
propagate=0
[logger_testModule]
level=ERROR
handlers=fh
qualname=testAgent.testModule.TEST
propagate=0
[handler_fh]
class=handlers.RotatingFileHandler
level=DEBUG
formatter=generic
maxBytes=1000
args=('spam.log',)
[formatter_generic]
format=%(asctime)s %(name)s %(levelname)s %(lineno)d %(message)s
I hope I have tried to make the question clear.
Thanks!!
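A minimal sketch of one way to address question 2 (not from the original thread): logging's INI format lets a formatter section name a custom class via a class= entry, so the formatter from logging_setup.py can be referenced directly, assuming that module is importable. The %(CMDID)s field is the one testFormatter injects.

[formatter_generic]
class=logging_setup.testFormatter
format=%(asctime)s %(name)s %(levelname)s %(lineno)d %(CMDID)s %(message)s

This mirrors the mymodule.CustomJsonFormatter approach shown in the first question above.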
