Python: make a logger available for all classes

So I'm aware of logging.getLogger(__name__), which returns a logger object.
However, now I have created a semi-complex logging scheme with QueueHandler, QueueListener and HTTPHandler (this is custom_logging.py):
import json
import logging
import queue
import uuid
from logging.handlers import QueueHandler, QueueListener

import requests
from requests.adapters import HTTPAdapter
from requests.packages.urllib3.util.retry import Retry

class CustomHttpHandler(logging.Handler):
    def __init__(self, url: str, token: str, silent: bool = True):
        """
        Initializes the custom http handler
        Parameters:
            url (str): The URL that the logs will be sent to
            token (str): The Authorization token being used
            silent (bool): If False the http response and logs will be sent
                           to STDOUT for debug
        """
        self.url = url
        self.token = token
        self.silent = silent

        # sets up a session with the server
        self.MAX_POOLSIZE = 100
        self.session = session = requests.Session()
        """session.headers.update({
            "Content-Type": "application/json",
            "Authorization": "Bearer %s" % (self.token)
        })"""
        self.session.mount("https://", HTTPAdapter(
            max_retries=Retry(
                total=5,
                backoff_factor=0.5,
                status_forcelist=[403, 500]
            ),
            pool_connections=self.MAX_POOLSIZE,
            pool_maxsize=self.MAX_POOLSIZE
        ))
        super().__init__()

    def emit(self, record):
        """
        This function gets called when a log event gets emitted. It receives a
        record, formats it and sends it to the url
        Parameters:
            record: a log record
        """
        logEntry = self.format(record)
        logEntry = json.loads(logEntry)
        response = self.session.post(self.url, json=logEntry)
        if not self.silent:
            print(logEntry)
            print(response.content)

def init():
    logger = logging.getLogger(__name__)
    console_handler = logging.StreamHandler()
    formatter = logging.Formatter("%(levelname)s: %(message)s")
    console_handler.setFormatter(formatter)

    # create a custom http logger handler
    httpHandler = CustomHttpHandler(
        url="https://someAddress/log",
        token="1234",
        silent=True
    )

    MACAddr = uuid.getnode()

    # create formatter - this formats the log messages accordingly
    formatter = logging.Formatter(json.dumps({
        "time": "%(asctime)s",
        "MAC Node": MACAddr,
        "line": "%(lineno)d",
        "module": "%(name)s",
        "logLevel": "%(levelname)s",
        "message": "%(message)s"
    }))

    # add formatter to custom http handler
    httpHandler.setFormatter(formatter)

    log_queue = queue.Queue(-1)
    queue_handler = QueueHandler(log_queue)
    logger.addHandler(queue_handler)
    listener = QueueListener(log_queue, console_handler, httpHandler)
    listener.start()

    logger.setLevel(logging.DEBUG)
    logger.info("Hello world!")
This works perfectly: I can see all the logging info on the HTTP server.
However, what if I want to use that same logger across different classes or files?
From my main.py I call the init() method:
custom_Logging.init()
Then how can I get at the structure created in that method? The first idea that comes to mind is to return the logger object from init(), and that does work, but I don't know whether it's the most elegant way around this problem.

You don't need to return a logger or pass it around, as long as you know its name (the module name - via __name__ - in the above case). You can get that logger from anywhere just using e.g. logging.getLogger('custom_Logging').
Or, if your use case allows, attach those handlers to the root logger and let other loggers use those handlers automatically. (By default, handlers of ancestor loggers are offered the chance to handle events logged to descendant loggers.)
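For example, a minimal sketch (assuming the module is importable as custom_Logging, as in the main.py above, so the logger created with __name__ is named 'custom_Logging'):
# main.py
import logging

import custom_Logging

custom_Logging.init()

# in any other module - no logger object needs to be passed around:
log = logging.getLogger('custom_Logging')
log.info('logged through the handlers set up in init()')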

Related

Python - Flask API - How to correlate logs in Azure ApplicationInsights when a client app sends in Traceparent header?

I have a python client application that calls (http) a python Flask api.
Both of these applications are logging to Azure Application insights using the opencensus libraries.
I want to do the logging in a fashion so that I can correlate the logs in ApplicationInsights end to end.
Python client app
For example, when the client app initiates an HTTP GET call to the Flask API, it generates an http request dependency log entry in ApplicationInsights.
The app also logs individual entries about the http request and http response into the trace table.
Flask API
I am logging the incoming HTTP request in the Flask API using a before_request decorator, and the HTTP response using an after_request decorator.
Also, the actual method (that the Flask routing invokes) has its own logging.
Note: these logs go into the trace table.
Expectation
I am trying to get the logs generated from the Flask API to correlate with the logs generated from the client application.
Current behaviour
Logs of Python client app
The logs in the dependency table have an operation_Id - All good!
The logs in the trace table have the same operation_Id and operation_ParentId as above - All good!
Logs of Flask api
The logs in the request table have the same operation_Id as above - All good!
The logs in the trace table generated by the before_request, after_request decorators - The operation_Id and operation_ParentId are blank. - Problematic!
The logs in the trace table generated by the logging statements inside the route/methods - The operation_Id and operation_ParentId are blank. - Problematic!
Help please
I can see that the Traceparent http header is coming in as part of the http request in the Flask API, but it looks like logging is ignoring it.
How do I get the logging statements to use the Traceparent data so that operation_Id and operation_ParentId show up correctly in the traces table for the Flask API?
Flask API Code
import datetime
import decimal
import json
import logging
import os

import flask
from flask import request, jsonify
import requests
from opencensus.ext.azure.log_exporter import AzureLogHandler, AzureEventHandler
from opencensus.ext.flask.flask_middleware import FlaskMiddleware
from opencensus.ext.azure.trace_exporter import AzureExporter
from opencensus.trace.samplers import ProbabilitySampler, AlwaysOnSampler
from opencensus.trace.tracer import Tracer
from opencensus.trace import config_integration

logger = logging.getLogger()

class MyJSONEncoder(flask.json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, decimal.Decimal):
            # Convert decimal instances to strings.
            return str(obj)
        if isinstance(obj, datetime.datetime):
            # strftime_iso_regular_format_str is defined elsewhere in the repo
            return obj.strftime(strftime_iso_regular_format_str)
        return super(MyJSONEncoder, self).default(obj)

# Initialize logging with Azure Application Insights
class CustomDimensionsFilter(logging.Filter):
    """Add custom-dimensions like run_id in each log by using filters."""
    def __init__(self, custom_dimensions=None):
        """Initialize CustomDimensionsFilter."""
        self.custom_dimensions = custom_dimensions or {}

    def filter(self, record):
        """Add the default custom_dimensions into the current log record."""
        dim = {**self.custom_dimensions,
               **getattr(record, "custom_dimensions", {})}
        record.custom_dimensions = dim
        return True

APPLICATION_INSIGHTS_CONNECTIONSTRING = os.getenv('APPLICATION_INSIGHTS_CONNECTIONSTRING')
modulename = 'FlaskAPI'
APPLICATION_NAME = 'FlaskAPI'
ENVIRONMENT = 'Development'

def callback_function(envelope):
    envelope.tags['ai.cloud.role'] = APPLICATION_NAME
    return True

logger = logging.getLogger(__name__)
log_handler = AzureLogHandler(
    connection_string=APPLICATION_INSIGHTS_CONNECTIONSTRING)
log_handler.addFilter(CustomDimensionsFilter(
    {
        'ApplicationName': APPLICATION_NAME,
        'Environment': ENVIRONMENT
    }))
log_handler.add_telemetry_processor(callback_function)
logger.addHandler(log_handler)

azureExporter = AzureExporter(
    connection_string=APPLICATION_INSIGHTS_CONNECTIONSTRING)
azureExporter.add_telemetry_processor(callback_function)
tracer = Tracer(exporter=azureExporter, sampler=AlwaysOnSampler())

app = flask.Flask("app")
app.json_encoder = MyJSONEncoder
app.config["DEBUG"] = True
middleware = FlaskMiddleware(
    app,
    exporter=azureExporter,
    sampler=ProbabilitySampler(rate=1.0),
)
config_integration.trace_integrations(['logging', 'requests'])

def getJsonFromRequestBody(request):
    isContentTypeJson = request.headers.get('Content-Type') == 'application/json'
    doesHaveBodyJson = False
    if isContentTypeJson:
        try:
            doesHaveBodyJson = request.get_json() is not None
        except Exception:
            doesHaveBodyJson = False
    if doesHaveBodyJson:
        return json.dumps(request.get_json())
    else:
        return None

def get_properties_for_customDimensions_from_request(request):
    values = ''
    if len(request.values) == 0:
        values += '(None)'
    for key in request.values:
        values += key + ': ' + request.values[key] + ', '
    properties = {'custom_dimensions':
                  {
                      'request_method': request.method,
                      'request_url': request.url,
                      'values': values,
                      'body': getJsonFromRequestBody(request)
                  }}
    return properties

def get_properties_for_customDimensions_from_response(request, response):
    request_properties = get_properties_for_customDimensions_from_request(request)
    request_customDimensions = request_properties.get('custom_dimensions')
    response_properties = {'custom_dimensions':
                           {
                               **request_customDimensions,
                               'response_status': response.status,
                               'response_body': response.data.decode('utf-8')
                           }}
    return response_properties

# Useful debugging interceptor to log all values posted to the endpoint
@app.before_request
def before():
    properties = get_properties_for_customDimensions_from_request(request)
    logger.warning("request {} {}".format(
        request.method, request.url), extra=properties)

# Useful debugging interceptor to log all endpoint responses
@app.after_request
def after(response):
    response_properties = get_properties_for_customDimensions_from_response(request, response)
    logger.warning("response: {}".format(
        response.status
    ), extra=response_properties)
    return response

@app.route('/api/{}/status'.format("v1"), methods=['GET'])
def health_check():
    message = "Health ok!"
    logger.info(message)
    return message

if __name__ == '__main__':
    app.run()
References used
Microsoft's guidance on Application Insights Log Correlation
My code repository where I have tested and reproduced the problem

How to format a log record inside the emit() function instead of creating a formatter object?

I wrote a custom logging handler to send a log record to an http endpoint.
import logging
import requests

class CustomHttpHandler(logging.Handler):
    def __init__(self, url: str):
        # url is the endpoint to send the log to
        super().__init__()
        self.url = url
        self.session = requests.Session()

    def emit(self, record):
        self.session.post(self.url, data=record)

import json
import logging
from handler import CustomHttpHandler

logger = logging.getLogger(__name__)
formatter = logging.Formatter(json.dumps({
    'time': '%(asctime)s',
    'message': '%(message)s'
}))
handler = CustomHttpHandler("http://myurl")
logger.addHandler(handler)
handler.setFormatter(formatter)
Is there any way I can format a log record inside the emit function instead of creating a formatter object and passing it with the setFormatter function?
I don't see a problem with using setFormatter; you just need to use the formatter in your emit function like so:
def emit(self, record):
    print(self.format(record))
Also note that you can apply your formatter to only this specific handler if you want:
handler.setFormatter(formatter)
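If you truly want no Formatter object at all, emit() can build the payload straight from the record's attributes. Here is a minimal sketch under that assumption (the class name InlineJsonHandler is mine; the time and message fields mirror the question's formatter):
import logging
import time

import requests

class InlineJsonHandler(logging.Handler):
    def __init__(self, url: str):
        super().__init__()
        self.url = url
        self.session = requests.Session()

    def emit(self, record):
        try:
            payload = {
                # record.created is a unix timestamp; format it by hand
                'time': time.strftime('%Y-%m-%d %H:%M:%S', time.localtime(record.created)),
                # getMessage() applies the %-style arguments for you
                'message': record.getMessage(),
            }
            self.session.post(self.url, json=payload)
        except Exception:
            self.handleError(record)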

How to decorate a Python process in order to capture HTTP requests and responses?

I'm currently investigating a very large Python codebase with lots of side effects and unexpected behavior, and I'd like to get a grasp on what it's doing by seeing all of the outbound HTTP requests that it makes throughout its execution, at any point in the call stack. Are there any utilities or integration paths that allow me to automatically profile the complete set of network calls made by code written in Python?
Specifically, as opposed to a solely external tool, I would like to be able to interact with the captured HTTP requests and responses programmatically from within the profiled module or an adjacent module; for example to:
Log the requests/responses using existing logging handlers within that codebase
Publish the request to an event broker like Kafka
Parse into a pandas dataframe for analysis
Integrate into existing unittest or pytest suites
I've looked at the offerings of different observability tools. For instance, Sentry appears to automatically integrate with Python's httplib to create a "breadcrumb" for each request; however Sentry only records this information when an exception is being thrown, and its default behavior is only to publish to its Web UI. New Relic also offers the ability to view "external service" calls as part of its application performance monitoring offerings, again through its own dashboard. In both cases, however, they each lack an officially-supported Python handler that would permit the tasks described above to occur within the process that generates the outbound network requests.
I looked at Sentry's Python SDK source code, to see how they integrated with http.client, and adapted their approach in a way that generalizes to meet my needs.
Here is the code that I wrote to decorate the http.client.HTTPConnection object to tap into requests, request bodies, and response objects. This particular example appends the data I'd like to collect to global lists that live under the profiling module, as well as logging that same data to standard out. You can easily substitute whatever bespoke functionality you'd like in place of those calls to list.append and logger.info:
import logging
import sys
from http.client import HTTPConnection

logger = logging.getLogger(__name__)
logger.setLevel(logging.DEBUG)
handler = logging.StreamHandler(stream=sys.stdout)
formatter = logging.Formatter(fmt="%(name)s %(funcName)s %(levelname)s: %(message)s")
handler.setFormatter(formatter)
logger.addHandler(handler)

put_request_content = []
get_response_content = []
request_bodies = []

def decorate_HTTPConnection():
    """Taken loosely from https://github.com/getsentry/sentry-python/blob/master/sentry_sdk/integrations/stdlib.py"""
    global put_request_content, get_response_content, request_bodies
    real_putrequest = HTTPConnection.putrequest
    real_getresponse = HTTPConnection.getresponse
    real__send_output = HTTPConnection._send_output

    def new_putrequest(self, method, url, skip_host=False, skip_accept_encoding=False):
        logger.info(f'{method}: {url}')
        put_request_content.append((method, url))
        real_putrequest(self, method, url, skip_host=skip_host, skip_accept_encoding=skip_accept_encoding)

    def new_getresponse(self):
        returned_response = real_getresponse(self)
        logger.info(returned_response)
        get_response_content.append(returned_response)
        return returned_response

    def new__send_output(self, message_body=None, encode_chunked=False):
        logger.info(f'Message body: {message_body}')
        request_bodies.append(message_body)
        real__send_output(self, message_body=message_body, encode_chunked=encode_chunked)

    HTTPConnection.putrequest = new_putrequest
    HTTPConnection.getresponse = new_getresponse
    HTTPConnection._send_output = new__send_output

decorate_HTTPConnection()
Here is a very simple script I used to test its behavior:
import logging
import sys

import requests
from http_profiler.connection_decorator import put_request_content, get_response_content, request_bodies

logger = logging.getLogger(__name__)
logger.setLevel(logging.DEBUG)
handler = logging.StreamHandler(stream=sys.stdout)
formatter = logging.Formatter(fmt="%(name)s %(funcName)s %(levelname)s: %(message)s")
handler.setFormatter(formatter)
logger.addHandler(handler)

def test_profile_http_get_via_requests_library(url):
    prev_len_put_request_content = len(put_request_content)
    prev_len_get_response_content = len(get_response_content)
    prev_len_request_bodies = len(request_bodies)
    logger.info(f"Starting the test: GET {url}")
    resp = requests.get(url=url)
    assert resp is not None
    assert len(put_request_content) - prev_len_put_request_content == 1
    assert len(get_response_content) - prev_len_get_response_content == 1
    assert len(request_bodies) - prev_len_request_bodies == 1

def test_profile_http_post_via_requests_library(url, data=None):
    if data is None:
        data = {"message": "Hello world!"}
    prev_len_put_request_content = len(put_request_content)
    prev_len_get_response_content = len(get_response_content)
    prev_len_request_bodies = len(request_bodies)
    logger.info(f"Starting the test: POST {url} with {data}")
    resp = requests.post(url=url, data=data)
    assert resp is not None
    assert len(put_request_content) - prev_len_put_request_content == 1
    assert len(get_response_content) - prev_len_get_response_content == 1
    assert len(request_bodies) - prev_len_request_bodies == 1

if __name__ == "__main__":
    test_profile_http_get_via_requests_library("https://example.com")
    test_profile_http_post_via_requests_library("https://example.com")
    logger.info(f'Requests: {put_request_content}')
    logger.info(f'Request bodies: {request_bodies}')
    logger.info(f'Responses: {[f"{response.status} {response.reason}" for response in get_response_content]}')
And here is the output of the testing script:
__main__ test_profile_http_get_via_requests_library INFO: Starting the test: GET https://example.com
http_profiler.connection_decorator new_putrequest INFO: GET: /
http_profiler.connection_decorator new__send_output INFO: Message body: None
http_profiler.connection_decorator new_getresponse INFO: <http.client.HTTPResponse object at 0x7ff40aa5df10>
__main__ test_profile_http_post_via_requests_library INFO: Starting the test: POST https://example.com with {'message': 'Hello world!'}
http_profiler.connection_decorator new_putrequest INFO: POST: /
http_profiler.connection_decorator new__send_output INFO: Message body: b'message=Hello+world%21'
http_profiler.connection_decorator new_getresponse INFO: <http.client.HTTPResponse object at 0x7ff40aa5deb0>
__main__ <module> INFO: Requests: [('GET', '/'), ('POST', '/')]
__main__ <module> INFO: Request bodies: [None, b'message=Hello+world%21']
__main__ <module> INFO: Responses: ['200 OK', '200 OK']

Python3: Records not getting pushed to Splunk

I have created a custom class which pushes my logs to Splunk, but somehow it is not working. Here is the class:
class Splunk(logging.StreamHandler):
    def __init__(self, url, token):
        super().__init__()
        self.url = url
        self.headers = {'Authorization': f'Splunk {token}'}
        self.propagate = False

    def emit(self, record):
        mydata = dict()
        mydata['sourcetype'] = 'mysourcetype'
        mydata['event'] = record.__dict__
        response = requests.post(self.url, data=json.dumps(mydata), headers=self.headers)
        return response
I call the class from my logger class somewhat like this (adding an additional handler), so that it logs to the console as well as sending to Splunk:
if splunk_config is not None:
    splunk_handler = Splunk(splunk_config["url"], splunk_config["token"])
    self.default_logger.addHandler(splunk_handler)
But somehow I am not able to see any logs in Splunk, though I can see the logs in the console.
When I try to run a stripped-down version of the above logic from the python3 terminal, it succeeds:
import requests
import json
url = 'myurl'
token = 'mytoken'
headers = {'Authorization': 'Splunk mytoken'}
propagate = False
mydata = dict()
mydata['sourcetype'] = 'mysourcetype'
mydata['event'] = {'name': 'root', 'msg': 'this is a sample message'}
response = requests.post(url, data=json.dumps(mydata), headers=headers)
print(response.text)
Things I have already tried: making my dictionary data JSON-serializable using the link below, but it didn't help.
https://pynative.com/make-python-class-json-serializable/
Any other things to try?
I've successfully used this Python Class for Sending Events to Splunk HTTP Event Collector instead of writing a dedicated class
https://github.com/georgestarcher/Splunk-Class-httpevent
An advantage is that it implements batchEvent() and flushBatch() methods to submit multiple events at once across multiple threads.
The example here should get you started:
https://github.com/georgestarcher/Splunk-Class-httpevent/blob/master/example.py
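A minimal sketch of that class in use, adapted from the repo's example.py (the constructor arguments and method names are taken from that example and may vary between versions; the token and host are placeholders):
from splunk_http_event_collector import http_event_collector

# placeholders - substitute your own HEC token and Splunk host
hec = http_event_collector('MY_HEC_TOKEN', 'splunk.example.com')

payload = {
    'sourcetype': 'mysourcetype',
    'event': {'name': 'root', 'msg': 'this is a sample message'},
}

# queue events locally, then submit them together across multiple threads
hec.batchEvent(payload)
hec.flushBatch()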

How to set up HTTPHandler for python logging

I'm trying to use the HTTPHandler class of the standard python logging library to send logs. I need to make an https post request with basic credentials (username and password). This is how I'm setting up the HTTPHandler:
host = 'example.com'
url = '/path'
handler = logging.handlers.HTTPHandler(host, url, method='POST', secure=True, credentials=('username','password'), context=None)
logger.addHandler(handler)
But the problem is, I'm not getting any logs in my remote server. I'm not even seeing any exception from the handler. Am I setting up the handler arguments incorrectly? I can send similar logs using a simple python http request:
url = 'https://username:password@example.com/path'
headers = {'content-type': 'application/json'}
jsonLog = {'id': '4444', 'level': 'info', 'message': 'python log'}
r = requests.post(url, data=json.dumps(jsonLog), headers=headers)
Do I need to set up a header somehow because of the json content-type? If yes, how do I set that up in the HTTPHandler?
Update
I thought I should update this with what I ended up doing. After numerous searches, I found I can create a custom handler by overriding emit() of logging.Handler.
class CustomHandler(logging.Handler):
    def emit(self, record):
        log_entry = self.format(record)
        # some code....
        url = 'url'
        # some code....
        return requests.post(url, log_entry, headers={"Content-type": "application/json"}).content
Feel free to post if anyone has any better suggestions.
Expanding on the solution saz gave, here's how to add a custom HTTP handler that forwards the emitted logs to the specified URL using a bearer token. It uses a requests session instead of establishing a new session for every log event. Furthermore, if the request fails, it attempts to resend the logs a given number of times.
Note: make sure your logging handler is as simple as possible to prevent the application from halting because of a log event.
I tested it with a simple localhost echo server and it works.
Feel free to suggest any changes.
import json
import logging

import requests
from requests.adapters import HTTPAdapter
from requests.packages.urllib3.util.retry import Retry

class CustomHttpHandler(logging.Handler):
    def __init__(self, url: str, token: str, silent: bool = True):
        '''
        Initializes the custom http handler
        Parameters:
            url (str): The URL that the logs will be sent to
            token (str): The Authorization token being used
            silent (bool): If False the http response and logs will be sent
                           to STDOUT for debug
        '''
        self.url = url
        self.token = token
        self.silent = silent

        # sets up a session with the server
        self.MAX_POOLSIZE = 100
        self.session = session = requests.Session()
        session.headers.update({
            'Content-Type': 'application/json',
            'Authorization': 'Bearer %s' % (self.token)
        })
        self.session.mount('https://', HTTPAdapter(
            max_retries=Retry(
                total=5,
                backoff_factor=0.5,
                status_forcelist=[403, 500]
            ),
            pool_connections=self.MAX_POOLSIZE,
            pool_maxsize=self.MAX_POOLSIZE
        ))
        super().__init__()

    def emit(self, record):
        '''
        This function gets called when a log event gets emitted. It receives a
        record, formats it and sends it to the url
        Parameters:
            record: a log record
        '''
        logEntry = self.format(record)
        response = self.session.post(self.url, data=logEntry)
        if not self.silent:
            print(logEntry)
            print(response.content)

# create logger
log = logging.getLogger('')
log.setLevel(logging.INFO)

# create formatter - this formats the log messages accordingly
formatter = logging.Formatter(json.dumps({
    'time': '%(asctime)s',
    'pathname': '%(pathname)s',
    'line': '%(lineno)d',
    'logLevel': '%(levelname)s',
    'message': '%(message)s'
}))

# create a custom http logger handler
httpHandler = CustomHttpHandler(
    url='<YOUR_URL>',
    token='<YOUR_TOKEN>',
    silent=False
)
httpHandler.setLevel(logging.INFO)

# add formatter to custom http handler
httpHandler.setFormatter(formatter)

# add handler to logger
log.addHandler(httpHandler)

log.info('Hello world!')
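For reference, the "simple localhost echo server" mentioned above can be as small as this (my own sketch, not from the original answer; pick any free port):
from http.server import BaseHTTPRequestHandler, HTTPServer

class EchoHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get('Content-Length', 0))
        body = self.rfile.read(length)
        print(body.decode('utf-8', errors='replace'))  # echo the log entry
        self.send_response(200)
        self.end_headers()

if __name__ == '__main__':
    HTTPServer(('localhost', 8000), EchoHandler).serve_forever()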
You will need to subclass HTTPHandler and override the emit() method to do what you need. You can use the current implementation of HTTPHandler.emit() as a guide.
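For instance, a minimal sketch (the endpoint, credentials, and JSON fields are placeholders; mapLogRecord() is the base-class helper that the stock HTTPHandler.emit() uses to turn a record into a dict):
import json
import logging.handlers

import requests

class JsonHTTPHandler(logging.handlers.HTTPHandler):
    def emit(self, record):
        try:
            payload = self.mapLogRecord(record)  # dict of the record's attributes
            requests.post(
                'https://example.com/path',      # placeholder endpoint
                data=json.dumps({'level': payload['levelname'],
                                 'message': record.getMessage()}),
                headers={'content-type': 'application/json'},
                auth=('username', 'password'),   # basic credentials
                timeout=5,
            )
        except Exception:
            self.handleError(record)

handler = JsonHTTPHandler('example.com', '/path', method='POST', secure=True)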
Following up on istvan's answer, you can use threads to prevent slowing down the program:
import concurrent.futures
import json
import logging

import requests
from requests.adapters import HTTPAdapter
from requests.packages.urllib3.util.retry import Retry

executor = concurrent.futures.ThreadPoolExecutor(max_workers=10)

class CustomHttpHandler(logging.Handler):
    def __init__(self, url: str, token: str, silent: bool = True):
        '''
        Initializes the custom http handler
        Parameters:
            url (str): The URL that the logs will be sent to
            token (str): The Authorization token being used
            silent (bool): If False the http response and logs will be sent
                           to STDOUT for debug
        '''
        self.url = url
        self.token = token
        self.silent = silent

        # sets up a session with the server
        self.MAX_POOLSIZE = 100
        self.session = session = requests.Session()
        session.headers.update({
            'Content-Type': 'application/json',
            'Authorization': 'Bearer %s' % (self.token)
        })
        self.session.mount('https://', HTTPAdapter(
            max_retries=Retry(
                total=5,
                backoff_factor=0.5,
                status_forcelist=[403, 500]
            ),
            pool_connections=self.MAX_POOLSIZE,
            pool_maxsize=self.MAX_POOLSIZE
        ))
        super().__init__()

    def emit(self, record):
        '''
        This function gets called when a log event gets emitted. It receives a
        record, formats it and sends it to the url
        Parameters:
            record: a log record
        '''
        # hand the actual network call off to the thread pool
        executor.submit(self.actual_emit, record)

    def actual_emit(self, record):
        logEntry = self.format(record)
        response = self.session.post(self.url, data=logEntry)
        print(response)
        if not self.silent:
            print(logEntry)
            print(response.content)

# create logger
log = logging.getLogger('test')
log.setLevel(logging.INFO)

# create formatter - this formats the log messages accordingly
formatter = logging.Formatter(json.dumps({
    'time': '%(asctime)s',
    'pathname': '%(pathname)s',
    'line': '%(lineno)d',
    'logLevel': '%(levelname)s',
    'message': '%(message)s'
}))

# create a custom http logger handler
httpHandler = CustomHttpHandler(
    url='<URL>',
    token='<YOUR_TOKEN>',
    silent=False
)
httpHandler.setLevel(logging.INFO)

# add formatter to custom http handler
httpHandler.setFormatter(formatter)

log.addHandler(httpHandler)

def main():
    print("start")
    log.error("\nstop")
    print("now")

if __name__ == "__main__":
    main()
What this program does is send the logs to the ThreadPoolExecutor with at most 10 threads; if there are more logs than the threads can handle, they queue up, which prevents slowdowns of the program.
What you can also do, at least what I am doing in my project of building a localhost central logging database and viewer, is spawn a separate thread on the server side and instantly return an HTTP response, so that all the database work happens after the HTTP response has been sent back. This removes the need for threads on the client, since it is on localhost and the latency is then almost zero.
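A minimal sketch of that server-side pattern (the /log route and store_log helper are hypothetical):
import threading

from flask import Flask, jsonify, request

app = Flask(__name__)

def store_log(payload):
    # placeholder for the slow database write; runs after the response is sent
    print('persisting:', payload)

@app.route('/log', methods=['POST'])
def ingest():
    payload = request.get_json(silent=True) or {}
    # hand the slow work to a background thread...
    threading.Thread(target=store_log, args=(payload,), daemon=True).start()
    # ...and acknowledge immediately so the client is not blocked
    return jsonify({'status': 'accepted'}), 202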
