Maybe someone here has a good idea on how to solve my issue. I have a REST API project driven by FastAPI. Every incoming request comes with a hash in the header. I am looking for a simple way to write this hash as an extra parameter to the logs without adding it by hand every time. My first idea was to write a middleware that stores the hash in a logger object and then use loggerObject.log(), which adds the hash automatically. But this only works for my own log messages. Log messages from, for example, exceptions or from libraries I use don't have the extra parameter.
I solved a similar problem using the structlog library. It allows you to set context variables that are added to every log line.
Note that this only applies to logs emitted by structlog; special configuration is required to output the stdlib logging logs with these context variables as well.
Here is a demo FastAPI application I wrote that generates a request-id for every incoming HTTP request and adds that request ID to every log line emitted while handling it.
In this example I have configured the stdlib logging logs to be processed by structlog, so the request-id parameter is added to them too.
https://gitlab.com/sagbot/structlog-demo
Reference files in demo:
demo/route.py - A bunch of demo routes. Note that the /long route emits both stdlib logs and structlog logs, and both are output with the request-id property.
demo/configure-logging.py - This function configures both structlog and stdlib logging. Note that the "special configuration" required to show the context variables in stdlib logs lives here, where the custom formatters are added in the logging.config.dictConfig(...) call.
demo/middlewares/request_logging/middleware.py - This is the middleware that generates and adds the request-id to the logs. You can alter it so that instead of generating a value it extracts the hash header of the incoming request, as you wanted.
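If you only need the header value bound to every structlog line (without the full demo), a minimal sketch could look like the following. The header name X-Request-Hash and the route are assumptions, not part of the demo; the sketch relies on merge_contextvars being in structlog's processor chain (it is in the default configuration of recent structlog versions):

import structlog
from fastapi import FastAPI, Request

app = FastAPI()
log = structlog.get_logger()

@app.middleware("http")
async def bind_request_hash(request: Request, call_next):
    # Reset any context left over from a previous request on this task.
    structlog.contextvars.clear_contextvars()
    # X-Request-Hash is a hypothetical header name; use whatever your clients send.
    structlog.contextvars.bind_contextvars(
        request_hash=request.headers.get("X-Request-Hash", "unknown")
    )
    return await call_next(request)

@app.get("/ping")
async def ping():
    # request_hash is added automatically because it is bound as a context variable.
    log.info("handling ping")
    return {"ok": True}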
I am tasked with writing a Python client app (a validating admission webhook) to determine:
which of these resources have PDBs configured
["deployments", "replicasets", "statefulsets", "horizontalpodautoscalers"]
and
if they have a PDB configured, verify that the PDB allows for at least one disruption ("ALLOWED DISRUPTIONS").
For the first requirement, how can I use the Python Kubernetes API to determine whether the incoming request (CREATE or UPDATE) refers to a resource that has a PDB?
The requirement is not that every CREATE/UPDATE operation has a PDB configured, but rather that "if" the resource has a PDB, the request must result in the PDB still permitting "at least one" disruption.
So am I right that I must first put the incoming request through some sort of filter that determines if the resource in question has a PDB? And if so, how can I code that filter?
What would pseudo-code even look like for this?
These are my first thoughts:
list all PDBs in the cluster
determine if the incoming request is referencing a resource that has a PDB
if the request is referencing a resource that has a PDB, determine whether or not the request would take the PDB down to "0" allowed disruptions
if "yes" to the previous step, do not allow the operation
This is what I have so far:
from kubernetes import config, client
config.load_incluster_config()
app_client = client.AppsV1Api()
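Continuing from there, a rough, untested sketch of the filter step might look like the following. It only handles matchLabels (not matchExpressions) and assumes the workload in the admission request carries its pod labels under spec.template.metadata.labels, which are both simplifications:

from kubernetes import config, client

config.load_incluster_config()
policy_client = client.PolicyV1Api()

def pdbs_matching(admission_request):
    """Return the PDBs whose selector matches the incoming object's pod template labels."""
    obj = admission_request["object"]
    namespace = admission_request["namespace"]
    # Deployments/replicasets/statefulsets put their pod labels in the template.
    pod_labels = obj["spec"]["template"]["metadata"].get("labels", {})

    matches = []
    for pdb in policy_client.list_namespaced_pod_disruption_budget(namespace).items:
        selector = (pdb.spec.selector.match_labels or {}) if pdb.spec.selector else {}
        # Naive subset match; matchExpressions are ignored in this sketch.
        if selector and all(pod_labels.get(k) == v for k, v in selector.items()):
            matches.append(pdb)
    return matches

def allows_disruption(pdb):
    # status.disruptions_allowed is what kubectl shows as "ALLOWED DISRUPTIONS".
    return (pdb.status.disruptions_allowed or 0) >= 1

The webhook would then deny the request only when pdbs_matching(...) returns at least one PDB for which allows_disruption(...) is false.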
In Python 2.7, the App Engine SDK did the work in the background to nest all logs under the parent request, so they were correlated in Google Stackdriver.
With the transition to Python 3, this is now done through Google Cloud Logging or structured logging, and from all the references I could find, the 'sub' logs must carry the same trace ID for Stackdriver to match them with the 'request' log.
And still, as you can see below, they appear as separate logs.
For context, I even tried this with an empty Django project deployed on App Engine.
Got the same result, even when following the example in the documentation:
https://cloud.google.com/run/docs/logging#writing_structured_logs
Logging to stdout gives the same result.
Edit:
After the initial request, all other requests are nested under the initial request when using stdout.
But the highest severity of the 'child' logs is not propagated to the 'parent' log, so severity filters won't pick up the actual log. See below:
Thanks for the question!
It looks like you're logging the trace correctly, but your logName indicates that you're not writing to stdout or stderr. If you use one of these for your logs, they will correlate properly, like this:
StackDriver Logs Screenshot
You can see that the logName ends with stdout. Logs written to stdout or stderr will correlate. You can create this as shown in the tutorial:
# Build structured log messages as an object.
global_log_fields = {}

# Add log correlation to nest all log messages
# beneath request log in Log Viewer.
trace_header = request.headers.get('X-Cloud-Trace-Context')

if trace_header and PROJECT:
    trace = trace_header.split('/')
    global_log_fields['logging.googleapis.com/trace'] = (
        f"projects/{PROJECT}/traces/{trace[0]}")

# Complete a structured log entry.
entry = dict(severity='NOTICE',
             message='This is the default display field.',
             # Log viewer accesses 'component' as jsonPayload.component'.
             component='arbitrary-property',
             **global_log_fields)

print(json.dumps(entry))
EDIT:
To filter out stdout and only see the request logs in the Stackdriver UI, you can de-select stdout in the filter. Logs Filter
For a sample using the Python client API, please see this article and the attached sample Flask app: Combining correlated Log Lines in Google Stackdriver.
I was able to achieve this kind of logging structure on Google Cloud Logging Console:
I was using the Django framework. I wrote a Django middleware which integrates the Google Cloud Logging API.
A "trace" field needs to be added to every log object, pointing to its parent log object.
Please check manage nesting of logs in Google Stackdriver with Django.
Please check the django-google-stackdriver-nested-logging log_middleware.py source on GitHub.
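For reference, a stripped-down sketch of that idea using only the stdlib logging module (the class names, the contextvar, and the GOOGLE_CLOUD_PROJECT variable are placeholders, not the linked project's actual code) might look like this:

import contextvars
import json
import logging
import os

PROJECT = os.environ.get("GOOGLE_CLOUD_PROJECT", "")
_trace = contextvars.ContextVar("trace", default="")

class TraceMiddleware:
    """Django middleware: remember the request's trace ID for later log records."""
    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        header = request.headers.get("X-Cloud-Trace-Context", "")
        if header and PROJECT:
            _trace.set(f"projects/{PROJECT}/traces/{header.split('/')[0]}")
        return self.get_response(request)

class StructuredFormatter(logging.Formatter):
    """Emit JSON lines that Cloud Logging parses, including the trace field."""
    def format(self, record):
        return json.dumps({
            "severity": record.levelname,
            "message": record.getMessage(),
            "logging.googleapis.com/trace": _trace.get(),
        })

The formatter would then be attached to a StreamHandler on the root logger, and TraceMiddleware added to MIDDLEWARE in settings.py.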
Probably I don't quite understand how logging really works in Python. I'm trying to debug a Flask+SQLAlchemy (but without flask_sqlalchemy) app which mysteriously hangs on some queries, but only when run from within Apache, so I need proper logging to get meaningful information. The Flask application by default comes with a nice logger+handler, but how do I get SQLAlchemy to use the same logger?
The "Configuring Logging" section in the SQLAlchemy documentation just explains how to turn on logging in general, but not how to "connect" SQLAlchemy's logging output to an already existing logger.
I've been looking at Flask + sqlalchemy advanced logging for a while with a blank, expressionless face. I have no idea if the answer to my question is even in there.
EDIT: Thanks to the answer given I now know that I can have two loggers use the same handler. Now, of course, my Apache error log is littered with hundreds of lines of echoed SQL calls. I'd like to log only error messages to the httpd log and divert all lower-level stuff to a separate logfile. See the code below. However, I still get every debug message in the httpd log. Why?
if app.config['DEBUG']:
    # Make logger accept all log levels
    app.logger.setLevel(logging.DEBUG)
    for h in app.logger.handlers:
        # restrict logging to /var/log/httpd/error_log to errors only
        h.setLevel(logging.ERROR)

if app.config['LOGFILE']:
    # configure debug logging only if logfile is set
    debug_handler = logging.FileHandler(app.config['LOGFILE'])
    debug_handler.setLevel(logging.DEBUG)
    app.logger.addHandler(debug_handler)

    # get logger for SQLAlchemy
    sq_log = logging.getLogger('sqlalchemy.engine')
    sq_log.setLevel(logging.DEBUG)

    # remove any preconfigured handlers there might be
    for h in sq_log.handlers:
        sq_log.removeHandler(h)
        h.close()

    # Now, SQLAlchemy should not have any handlers at all. Let's add one
    # for the logfile
    sq_log.addHandler(debug_handler)
You cannot make SQLAlchemy and Flask use the same logger, but you can make them write to one place by adding a common handler to both. This article may also be helpful: https://www.electricmonk.nl/log/2017/08/06/understanding-pythons-logging-module/
By the way, if you want to group all logs belonging to a single request, you can set a unique name for the current thread before handling the request and add threadName to your logging formatter.
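A minimal sketch of that combined idea, assuming the Flask app object from your question; the log file path, the format string, and the uuid-based thread name are placeholders:

import logging
import threading
import uuid

# One shared handler; both loggers write to the same file.
shared_handler = logging.FileHandler('/tmp/app.log')  # placeholder path
shared_handler.setFormatter(
    logging.Formatter('%(asctime)s %(threadName)s %(name)s %(levelname)s %(message)s'))

app.logger.addHandler(shared_handler)                        # Flask's logger
logging.getLogger('sqlalchemy.engine').addHandler(shared_handler)
logging.getLogger('sqlalchemy.engine').setLevel(logging.INFO)

@app.before_request
def tag_thread():
    # Rename the worker thread so threadName groups all lines from one request.
    threading.current_thread().name = uuid.uuid4().hex[:8]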
Answer to my own question in the EDIT: I still had echo=True set on create_engine, so what I saw was the additional output on stderr. echo=False stops that, but the engine still logs at DEBUG level.
Clear all corresponding handlers created by SQLAlchemy:
logging.getLogger("sqlalchemy.engine.Engine").handlers.clear()
The code above should be called after the engine is created.
I simply want to receive notifications from Dropbox when a change has been made. I am currently following this tutorial:
https://www.dropbox.com/developers/reference/webhooks#tutorial
The GET method is done, verification is good.
However, when trying to mimic their implementation of POST, I am struggling because of a few things:
I have no idea what redis_url means in the def_process function of the tutorial.
I can't actually verify if anything is really being sent from Dropbox.
Also, any advice on how I can debug? I can't print anything from my program since it has to be run on a site rather than in an IDE.
Redis is a key-value store; it's just a way to cache your data throughout your application.
For example, the access token that is received after the OAuth callback is stored:
redis_client.hset('tokens', uid, access_token)
only to be used later in process_user:
token = redis_client.hget('tokens', uid)
(code from https://github.com/dropbox/mdwebhook/blob/master/app.py as suggested by their documentation: https://www.dropbox.com/developers/reference/webhooks#webhooks)
The same goes for per-user delta cursors that are also stored.
There are plenty of resources on how to install Redis, for example:
https://www.digitalocean.com/community/tutorials/how-to-install-and-use-redis
In this case your redis_url would be something like:
"redis://localhost:6379/"
There are also hosted solutions, e.g. http://redistogo.com/
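Either way, the client can be built directly from that URL. A minimal sketch (the REDIS_URL environment-variable name is just a common convention, not something the tutorial mandates):

import os
import redis

# Fall back to a local Redis if no hosted instance is configured.
redis_url = os.environ.get('REDIS_URL', 'redis://localhost:6379/')
redis_client = redis.from_url(redis_url)

# Same pattern as the tutorial: store and read back a token for a user id.
redis_client.hset('tokens', '12345', 'example-access-token')
print(redis_client.hget('tokens', '12345'))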
A possible workaround would be to use a database for this purpose.
As for debugging, you could use the logging facility for Python; it's thread-safe and capable of writing output to a file stream, and it should provide you with plenty of information if properly used.
More info here:
https://docs.python.org/2/howto/logging.html
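As a minimal sketch of that idea (the log file path and the /webhook route are assumptions based on the tutorial's Flask app):

import logging
from flask import Flask, request

app = Flask(__name__)

# Write debug output to a file you can tail on the server,
# since print() output is not visible when running behind a web server.
logging.basicConfig(filename='/tmp/webhook.log', level=logging.DEBUG,
                    format='%(asctime)s %(levelname)s %(message)s')

@app.route('/webhook', methods=['POST'])
def webhook():
    # Log the raw body Dropbox sends so you can see whether anything arrives at all.
    logging.debug('Webhook POST received: %s', request.get_data(as_text=True))
    return ''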
When setting up a Pyramid app and adding settings to the Configurator, I'm having trouble understanding how to access information from the request, like request.session and such. I'm completely new to Pyramid and I've searched all over the place for information on this but found nothing.
What I want to do is access information in the request object when sending out exception emails in production. I can't access the request object, since it's not global in the __init__.py file when creating the app. This is what I've got now:
import logging
import logging.handlers
from logging import Formatter
config.include('pyramid_exclog')
logger = logging.getLogger()
gm = logging.handlers.SMTPHandler(('localhost', 25), 'email@email.com', ['email@email.com'], 'Error')
gm.setLevel(logging.ERROR)
logger.addHandler(gm)
This works fine, but I want to include information about the logged-in user, stored in the session, when sending out the exception emails. How can I access that information from __init__.py?
Attempting to make the request a global variable, or to somehow store a pointer to the "current" request globally (if that's what you're going to try by subscribing to the NewRequest event), is not a terribly good idea - a Pyramid application can have more than one thread of execution, so more than one request can be active within a single process at the same time. So the approach may appear to work during development, when the application runs in single-threaded mode and just one user accesses it, but produce really funny results when deployed to a production server.
Pyramid has pyramid.threadlocal.get_current_request() function which returns thread-local request variable, however, the docs state that:
This function should be used extremely sparingly, usually only in unit
testing code. it’s almost always usually a mistake to use
get_current_request outside a testing context because its usage makes
it possible to write code that can be neither easily tested nor
scripted.
which suggests that the whole approach is not "pyramidic" (same as pythonic, but for Pyramid :)
Possible other solutions include:
look at the exclog.extra_info parameter, which includes the environ and params attributes of the request in the log message
registering exception views would allow completely custom processing of exceptions (see the sketch after this list)
using WSGI middleware, such as WebError#error_catcher or Paste#error_catcher, to send emails when an exception occurs
if you want to log not only exceptions but possibly other non-fatal information, maybe just writing a wrapper function would be enough:
if int(request.POST['donation_amount']) >= 1000000:
    send_email("Wake up, we're rich!", authenticated_userid(request))