I need to set up logging in a custom web app, and ideally it would match the magic that happens when running a web app in Google App Engine.
For example, in GAE there is a request_log which can be viewed. This groups all log statements together under each request, and each request has the HTTP status code together with the endpoint path of the URL. Here is an example (I apologise in advance for the crude editing here):
In a Flask application I am deploying to Google Kubernetes Engine, I would like to get the same level of logging in place. Trouble is, I just do not know where to start.
I have got as far as installing the google-cloud-logging Python library and have some rudimentary logging in place, like this:
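Something along these lines (a minimal sketch of the library's standard setup_logging() integration; the exact snippet is only illustrative):

import logging

import google.cloud.logging

# Illustrative setup (assumed): attach the Cloud Logging handler to the
# root logger so that plain logging calls end up in Stackdriver.
client = google.cloud.logging.Client()
client.setup_logging()

logging.info("Handling request...")
logging.warning("Something looks odd")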
...but this is nowhere near the level I would like.
So the question is: where do I start? Any searches/docs I have found so far have come up short.
Structured Logging
In Stackdriver Logging, structured logs refer to log entries that use the jsonPayload field to add structure to their payloads. If you use the Stackdriver Logging API or the command-line utility, gcloud logging, you can control the structure of your payloads. Here's an example of what a jsonPayload would look like:
{
  insertId: "1m9mtk4g3mwilhp"
  jsonPayload: {
    [handler]: "/"
    [method]: "GET"
    [message]: "200 OK"
  }
  labels: {
    compute.googleapis.com/resource_name: "add-structured-log-resource"
  }
  logName: "projects/my-sample-project-12345/logs/structured-log"
  receiveTimestamp: "2018-03-21T01:53:41.118200931Z"
  resource: {
    labels: {
      instance_id: "5351724540900470204"
      project_id: "my-sample-project-12345"
      zone: "us-central1-c"
    }
    type: "gce_instance"
  }
  timestamp: "2018-03-21T01:53:39.071920609Z"
}
You can set your own customizable jsonPayload with the parameters and values that you would like to obtain and then write this information to Stackdriver Logs Viewer.
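For instance, with the Python client library you could write such a payload yourself; a minimal sketch (the logger name and fields here are just examples):

from google.cloud import logging as gcp_logging

# The client picks up the project from the environment (assumed to be configured).
client = gcp_logging.Client()
logger = client.logger("structured-log")  # example log name

# log_struct() writes the dict as the entry's jsonPayload.
logger.log_struct(
    {"handler": "/", "method": "GET", "message": "200 OK"},
    severity="INFO",
)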
Setting Debug mode to True
When setting debug=True, you will be able to see your app in debugging mode. The HTTP requests will appear on your console for debugging purposes, and you could then write these requests to the Stackdriver Logs Viewer. Here is an example of a Hello World Flask app running in debug mode:
from flask import Flask

app = Flask(__name__)

@app.route("/")
def hello():
    return "Hello World!"

if __name__ == "__main__":
    app.run(port=5000, debug=True)
To which you could add a Flask logging handler, as follows:
import logging
from logging.handlers import RotatingFileHandler
from flask import Flask

app = Flask(__name__)

@app.route('/')
def foo():
    app.logger.warning('A warning occurred (%d apples)', 42)
    app.logger.error('An error occurred')
    app.logger.info('Info')
    return "foo"

if __name__ == '__main__':
    handler = RotatingFileHandler('foo.log', maxBytes=10000, backupCount=1)
    handler.setLevel(logging.INFO)
    app.logger.addHandler(handler)
    app.run()
As you can see, there are ways to achieve this by following the proper log configuration, although the Stackdriver Logs Viewer UI will not look the same for Kubernetes logs as it does for App Engine logs.
Additionally, you could also take a look into Combining correlated log lines in Google Stackdriver since it will give you a better idea of how to batch your logs by categories or groups in case you need to do so.
Click on "View options" at top right corner in the logs panel > "Modify Custom fields"
https://cloud.google.com/logging/docs/view/overview#custom-fields
I am writing this here to let people know what I have come up with during my investigations.
The information supplied by sllopis got me to the closest solution: using a mixture of structured logging and refactoring some of the code in the flask-gcp-log-groups library, I am able to get requests logged in Stackdriver with log lines correlated underneath, roughly as sketched below.
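In essence, one "request" entry carries the httpRequest fields and every application log line is written with the same trace ID, so the Logs Viewer can nest them. A simplified sketch (the log names are mine, and the real flask-gcp-log-groups handling is more involved):

from flask import Flask, request
from google.cloud import logging as gcp_logging

app = Flask(__name__)

client = gcp_logging.Client()
request_logger = client.logger("request_log")  # parent "request" entries
app_logger = client.logger("app_log")          # correlated child entries

def _trace():
    # On GKE behind the Google load balancer the trace ID arrives in this header.
    header = request.headers.get("X-Cloud-Trace-Context", "")
    if not header:
        return None
    return f"projects/{client.project}/traces/{header.split('/')[0]}"

@app.route("/")
def index():
    # Child entry: shares the trace of the request entry written below.
    app_logger.log_text("now you see me", trace=_trace(), severity="INFO")
    return "ok"

@app.after_request
def write_request_entry(response):
    # Parent entry: the httpRequest fields give the method, path and status
    # shown on the request line in the Logs Viewer.
    request_logger.log_struct(
        {"message": f"{request.method} {request.path}"},
        trace=_trace(),
        severity="INFO",
        http_request={
            "requestMethod": request.method,
            "requestUrl": request.path,
            "status": response.status_code,
        },
    )
    return response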
Unfortunately this solution has a few gaping holes, making it far from ideal, albeit the best I can come up with so far given Stackdriver's rigidity.
Each time I drill into a request there is a "flash" as Stackdriver searches and grabs all the trace entries matching that request. The bigger the collection of entries, the longer the flash takes to complete.
I cannot search for text within the correlated lines when only looking at the "request" log. For example, say a correlated log entry underneath a request has a string with the text "now you see me" - if I search for the string "see" it will not bring up that request in the list of search results.
I may be missing something obvious, but I have spent several very frustrating days trying to achieve something which you would think should be quite simple.
Ideally I would create a protoPayload per log entry, within which I would put an array under the property "line", similar to how Google App Engine does its logging.
However there does not appear to be a way of doing this as protoPayload is reserved for Audit Logs.
Thanks to sllopis for the information supplied - if I don't find a better solution soon I will mark the answer as correct as it is the closest I believe I will get to what I want to achieve.
Given the situation I am very tempted to ditch Stackdriver in favour of a better logging solution - any suggestions welcome!
Related
Hi, I have a Flask application that is built as a Docker image to serve as an API.
This image is deployed to multiple environments (DEV/QA/PROD).
I want to use a separate Application Insights resource for each environment.
Using a single Application Insights resource works fine.
Here is a code snippet:
app.config['APPINSIGHTS_INSTRUMENTATIONKEY'] = APPINSIGHTS_INSTRUMENTATIONKEY
appinsights = AppInsights(app)

@app.after_request
def after_request(response):
    appinsights.flush()
    return response
But to target multiple Application Insights resources, I need to configure app.config with the key of the right resource.
I thought of this solution, which throws errors.
Here is a snippet:
app = Flask(__name__)

def monitor(key):
    app.config['APPINSIGHTS_INSTRUMENTATIONKEY'] = key
    appinsights = AppInsights(app)

    @app.after_request
    def after_request(response):
        appinsights.flush()
        return response

@app.route("/")
def hello():
    hostname = urlparse(request.base_url).hostname
    print(hostname)
    if hostname == "dev url":
        print('Dev')
        monitor('3ed57a90-********')
    if hostname == "prod url":
        print('prod')
        monitor('941caeca-********-******')
    return "hello"
This example contains the function monitor, which reads the URL and decides which instrumentation key to use so metrics are sent to the right place, but apparently I can't do this processing after the request is sent. (Is there a way a config variable can be changed based on the URL condition?)
Error message:
AssertionError: The setup method 'errorhandler' can no longer be called on the application. It has already handled its first request, any changes will not be applied consistently. Make sure all imports, decorators, functions, etc. needed to set up the application are done before running it.
I hope someone can guide me to a better solution.
Thanks in advance.
AFAIK, the Application Insights SDK normally collects telemetry data and sends it to Azure in batches, so you have to keep a single Application Insights resource per application. Use staging (separate deployments) to use different Application Insights resources for the same application.
From the moment a request starts until its response completes, Application Insights takes care of that service's life cycle; it tracks information from application start to end. That is why we can't use more than one Application Insights resource in a single application.
When the application starts, Application Insights starts collecting telemetry data, and when the application stops, it stops gathering it. Flush is used to push buffered information to Application Insights even if the application stops in between.
I have tried what you have used, and the log confirms the same.
With a single Application Insights resource, I was able to collect all the telemetry information.
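If the goal is simply one Application Insights resource per environment (DEV/QA/PROD), a common approach is to pick the key once at startup, for example from an environment variable set per deployment. A sketch (not from the original answer; the variable name is an assumption):

import os

from flask import Flask
from applicationinsights.flask.ext import AppInsights

app = Flask(__name__)

# Each environment's deployment supplies its own key via an environment
# variable (assumed name), so the app is configured once, before any request.
app.config['APPINSIGHTS_INSTRUMENTATIONKEY'] = os.environ['APPINSIGHTS_INSTRUMENTATIONKEY']
appinsights = AppInsights(app)

@app.after_request
def after_request(response):
    appinsights.flush()
    return response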
In Python 2.7, the App Engine SDK did the work in the background to nest all logs under the parent request, giving correlation in Google Stackdriver.
As of the transition to Python 3, this is done through the use of Google Cloud Logging or structured logging, and from all the different references I could find, it is important to have the same trace ID in the 'sub' logs for Stackdriver to match them with the 'request' log.
And still, as you can see below, they appear as different logs.
For context, I even tried this on an empty Django project deployed on App Engine.
Got the same result, even when following the example in the documentation:
https://cloud.google.com/run/docs/logging#writing_structured_logs
Trying to log to stdout gives the same result.
Edit:
After the initial request, all other requests are nested under the initial request when using stdout.
However, the highest severity of the 'child' logs is not propagated to the 'parent' log, so the filters won't pick up the actual log. See below:
Thanks for the question!
It looks like you're logging the trace correctly, but your logName indicates that you're not writing to stdout or stderr. If you use one of these for your logs, they will correlate properly, like this:
StackDriver Logs Screenshot
You can see that the logName ends with stdout. Logs written to stdout or stderr will correlate. You can create this as shown in the tutorial:
# Build structured log messages as an object.
global_log_fields = {}

# Add log correlation to nest all log messages
# beneath request log in Log Viewer.
trace_header = request.headers.get('X-Cloud-Trace-Context')
if trace_header and PROJECT:
    trace = trace_header.split('/')
    global_log_fields['logging.googleapis.com/trace'] = (
        f"projects/{PROJECT}/traces/{trace[0]}")

# Complete a structured log entry.
entry = dict(severity='NOTICE',
             message='This is the default display field.',
             # Log viewer accesses 'component' as 'jsonPayload.component'.
             component='arbitrary-property',
             **global_log_fields)
print(json.dumps(entry))
EDIT:
To filter out stdout and only see the request logs in the Stackdriver UI, you can de-select stdout from the filter. Logs Filter
For a sample using the python client API, please see this article and attached sample Flask app. Combining correlated Log Lines in Google Stackdriver
I was able to achieve this kind of logging structure on Google Cloud Logging Console:
I was using the Django framework. I wrote Django middleware which integrates the Google Cloud Logging API.
A "trace" needs to be added to every log object, pointing to its parent log object.
Please check manage nesting of logs in Google Stackdriver with Django.
Please check the django-google-stackdriver-nested-logging log_middleware.py source on GitHub.
Probably I don't quite understand how logging really works in Python. I'm trying to debug a Flask+SQLAlchemy (but without flask_sqlalchemy) app which mysteriously hangs on some queries only if run from within Apache, so I need proper logging to get meaningful information. The Flask application by default comes with a nice logger+handler, but how do I get SQLAlchemy to use the same logger?
The "Configuring Logging" section in the SQLAlchemy documentation just explains how to turn on logging in general, but not how to "connect" SQLAlchemy's logging output to an already existing logger.
I've been looking at Flask + sqlalchemy advanced logging for a while with a blank, expressionless face. I have no idea if the answer to my question is even in there.
EDIT: Thanks to the answer given I now know that I can have two loggers use the same handler. Now of course my apache error log is littered with hundreds of lines of echoed SQL calls. I'd like to log only error messages to the httpd log and divert all lower-level stuff to a separate logfile. See the code below. However, I still get every debug message into the http log. Why?
if app.config['DEBUG']:
    # Make logger accept all log levels
    app.logger.setLevel(logging.DEBUG)

    for h in app.logger.handlers:
        # restrict logging to /var/log/httpd/error_log to errors only
        h.setLevel(logging.ERROR)

    if app.config['LOGFILE']:
        # configure debug logging only if logfile is set
        debug_handler = logging.FileHandler(app.config['LOGFILE'])
        debug_handler.setLevel(logging.DEBUG)
        app.logger.addHandler(debug_handler)

        # get logger for SQLAlchemy
        sq_log = logging.getLogger('sqlalchemy.engine')
        sq_log.setLevel(logging.DEBUG)

        # remove any preconfigured handlers there might be
        # (iterate over a copy, since removeHandler mutates the list)
        for h in list(sq_log.handlers):
            sq_log.removeHandler(h)
            h.close()

        # Now, SQLAlchemy should not have any handlers at all. Let's add one
        # for the logfile
        sq_log.addHandler(debug_handler)
You cannot make SQLAlchemy and Flask use the same logger, but you can make them write to one place by adding a common handler. Maybe this article is helpful: https://www.electricmonk.nl/log/2017/08/06/understanding-pythons-logging-module/
By the way, if you want to group all logs belonging to a single request, you can set a unique name for the current thread before the request and include threadName in your logging formatter.
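For example, a minimal sketch of the shared-handler approach (assuming an existing Flask app object; the filename and format are just examples):

import logging

# One handler attached to both the Flask app logger and the SQLAlchemy logger,
# so both write to the same file.
shared_handler = logging.FileHandler('app.log')  # example path
shared_handler.setLevel(logging.DEBUG)
shared_handler.setFormatter(logging.Formatter(
    '%(asctime)s %(threadName)s %(name)s %(levelname)s %(message)s'))

app.logger.addHandler(shared_handler)

sa_logger = logging.getLogger('sqlalchemy.engine')
sa_logger.setLevel(logging.INFO)  # INFO logs SQL statements, DEBUG also logs result rows
sa_logger.addHandler(shared_handler)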
Answer to my question in the EDIT: I still had echo=True set on create_engine, so what I saw was all the additional output on stderr. echo=False stops that, but the engine still logs at level DEBUG.
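In other words (a minimal sketch; the connection URL is just a placeholder):

from sqlalchemy import create_engine

# echo=False: no extra output on stderr; the 'sqlalchemy.engine' logger still
# emits SQL statements (INFO) and result rows (DEBUG) to whatever handlers are attached.
engine = create_engine('postgresql://user:password@localhost/mydb', echo=False)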
Clear all corresponding handlers created by SQLAlchemy:
logging.getLogger("sqlalchemy.engine.Engine").handlers.clear()
The code above should be called after the engine is created.
Is there a way to log waitress-serve output into a file?
The current command I use is:
waitress-serve --listen=localhost:8080 --threads=1 my_app_api:app
The application was not written with Waitress in mind, so we chose to serve it from the command line to avoid changes (for now, at least).
TL;DR: waitress-serve doesn't provide a way to do it. See the "OK, how do I get it to log?" section.
Background
Per the documentation for the command-line usage of waitress-serve, no, there's no way to set up logging. See the arguments docs.
waitress-serve is just an executable to make running your server more convenient. Its source code is here: runner.py. If you read it, you can see it is basically just calling from waitress import serve; serve(**args) for you. (That code clip is not literally what it's doing, but in spirit, yes.)
The documentation for Waitress says that it doesn't log HTTP traffic; that's not its job. But it will log its own errors or stack traces: logging docs. If you read the Waitress source trying to find when it logs stuff, you'll notice it doesn't seem to log HTTP traffic anywhere (github log search). It primarily logs stuff to do with the socket layer.
Waitress does say that if you want to log HTTP traffic, then you need another component. In particular, it points you to the pastedeploy docs, which is some middleware that can log HTTP traffic for you.
The documentation from Waitress is actually kind of helpful in answering your question, though not direct and explicit. It says:
The WSGI design is modular.
per the logging doc
I.e. waitress won't log http traffic for you. You'll need another WSGI component to do that, and because WSGI is modular, you can probably choose a few things.
If you want some background on how this works, there's a pretty good post here leftasexercise.com
OK, how do I get it to log?
Use tee
Basically, if you just want to log the same stuff that is output from waitress-serve then you don't need anything special.
waitress-serve --listen=localhost:8080 --threads=1 my_app_api:app | tee -a waitress-serve.log
Python logging
But if you're actually looking for logging coming from Python's standard logger (say your app is making logger calls, or you want to log HTTP traffic), then you can set that up in your Python application code. E.g., edit your application's source code and get it to set up logging to a file:
import logging
logging.basicConfig(filename='app.log', encoding='utf-8', level=logging.DEBUG)
PasteDeploy middleware for http logs
Or if you're looking for Apache-style HTTP logging, then you can use something like PasteDeploy to do it. Note that PasteDeploy is another Python dependency, so you'll need to install it. E.g.:
pip install PasteDeploy
Then you need to set up an .ini file that tells PasteDeploy how to start your server, and also tells it to use TransLogger to create Apache-style HTTP logs. This is explained in more detail here: logging with pastedeploy. The .ini file is specific to each app, but from your question it sounds like it should look like:
[app:wsgiapp]
use = my_app_api:app

[server:main]
use = egg:waitress#main
host = 127.0.0.1
port = 8080

[filter:translogger]
use = egg:Paste#translogger
setup_console_handler = False

[pipeline:main]
pipeline = translogger
           wsgiapp
You'll still need to edit the source code of your app to get PasteDeploy to load it with your configuration file:
from paste.deploy import loadapp
wsgi_app = loadapp('config:/path/to/config.ini')
Webframework-dependent roll-your-own http logging
Even if you want to log HTTP traffic, you don't necessarily need something like PasteDeploy. For example, if you are using Flask as the web framework, you can write your own HTTP logs using the after_request decorator:
from time import strftime
from flask import request

# 'logger' is assumed to be configured elsewhere (see the linked gist).
@app.after_request
def after_request(response):
    timestamp = strftime('[%Y-%b-%d %H:%M]')
    logger.error('%s %s %s %s %s %s', timestamp, request.remote_addr, request.method, request.scheme, request.full_path, response.status)
    return response
See the full gist at https://gist.github.com/alexaleluia12/e40f1dfa4ce598c2e958611f67d28966
I have tried to search for answers to this question online, but in vain. I do see the answer for "How do I set the log level in the Google App Engine Python dev server", which is useful to know, but if I understand correctly, this doesn't automatically translate to the production environment, right?
Deploying my dev code with hundreds of logging.debug() statements always makes them show up in the production server logs. Is there a simple switch I could flip on the production server to set the logging level and avoid clogging the logs with all the debug messages? At least from looking at Google App Engine's admin console, I haven't found a way to do this. This is hard to believe, because one would think that App Engine developers would have provided a super simple way to do this.
As Paul Collingwood said in his comment, it is easy to set a filter in the Developer Console, in order to reduce visual clutter.
If there are cases in which you do not wish to have the debug logs recorded at all (e.g. while in production), you might like to write a little wrapper for the logging calls, which checks whether the app is in dev or production and, based on that, decides whether to write something to the log.
I'm thinking something like:
import logging

class Logger():
    @staticmethod
    def debug(*args, **kwargs):
        if not running_in_production():  # needs to be implemented elsewhere.
            logging.debug(*args, **kwargs)

    @staticmethod
    def info(*args, **kwargs):
        """ any rules here? """
        logging.info(*args, **kwargs)

    # other functions here.
Then, in your individual files, you could replace import logging with import logger as logging for a drop-in replacement which is adaptable to the environment where it is running - and to any other imaginable factors.
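For completeness, one way to implement running_in_production() on App Engine could be to check the environment variables the runtime sets; a sketch, assuming the standard environment:

import os

def running_in_production():
    # App Engine standard (Python 3) sets GAE_ENV to 'standard'; the old
    # Python 2.7 runtime sets SERVER_SOFTWARE starting with 'Google App Engine'.
    # Anything else is treated as local/dev.
    return (os.environ.get('GAE_ENV', '').startswith('standard')
            or os.environ.get('SERVER_SOFTWARE', '').startswith('Google App Engine'))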