I have a relatively complex ecosystem of applications and libraries that are scheduled to run in my environment.
I am trying to improve my logging, and in particular I'd like to write debug information to a log file. I'd like that log to contain the logger.debug("string") lines from all the imported libraries I wrote, but not from the libraries I import from PyPI.
example:
import sys
import numpy
from bs4 import BeautifulSoup
import logging
import mylibrary
import myotherlibrary
logger = logging.getLogger(application_name) # I don't use __name__ in all of them, but I can change this line as necessary
So in this case, when I set the logger level to DEBUG, I'd like to see debug information from the current script, from mylibrary, and from myotherlibrary, but not from bs4, numpy, etc.
Bonus: ideally I would like not to have to hardcode the library names every time, but just have the script "know" them (from a naming convention, maybe?)
If anyone has any ideas it'd be greatly appreciated!
Python doesn't really have a concept of "libraries I wrote" vs. "libraries installed from PyPI" - a library is a library, unfortunately.
However, depending on how your libraries are set up, you may be able to get a really hacky custom logger?
By default, Python libraries installed with pip go to a central location - usually something like /usr/local/lib, or under %APPDATA% on Windows. In contrast, local libraries usually live in the same directory as the calling script. We can use this to our advantage!
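(You can check this for yourself: a module's __file__ attribute shows where it was loaded from. mylibrary below stands in for one of your own local libraries.)
import numpy
print(numpy.__file__)      # typically .../site-packages/numpy/__init__.py

import mylibrary           # hypothetical local library next to your script
print(mylibrary.__file__)  # typically ./mylibrary/__init__.py or ./mylibrary.py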
The following code demonstrates a kind of proof-of-concept - I've left a few methods to implement as an exercise ;)
# CustomLogger.py
import __main__
import logging
import os

# create a custom log class, inheriting the current logger class
class CustomLogger(logging.getLoggerClass()):
    custom_lib = False

    def __init__(self, name):
        # initialise the base logger
        super().__init__(name)
        # get the directory we are being run from
        current_dir = os.path.dirname(__main__.__file__)
        permutations = ['/', '.py', '.pyc']
        # check if we are a custom library, or one installed via pip etc.
        self.custom_lib = self.checkExists(current_dir, permutations)
        self.propagate = not self.custom_lib

    def checkExists(self, current_dir, permutations):
        # loop through each permutation and see if a file matching that spec exists
        # currently looks for .py/.pyc files and directories
        for perm in permutations:
            file = os.path.join(current_dir, self.name + perm)
            if os.path.exists(file):
                return True
        return False

    def isEnabledFor(self, level):
        if self.custom_lib:
            return super().isEnabledFor(level)
        return False

    # the hackiest part :)
    # these are two sample overrides that only log if we're a custom
    # library (i.e. one we've written, not installed)
    # there are a few more methods that I've not implemented, a full
    # list is available at https://docs.python.org/3/library/logging.html#logging.Logger
    def debug(self, msg, *args, **kwargs):
        if self.custom_lib:
            return super().debug(msg, *args, **kwargs)

    def info(self, msg, *args, **kwargs):
        if self.custom_lib:
            return super().info(msg, *args, **kwargs)

# most important part - also override the logger class
# this means that any calls to logging.getLogger() will use our new subclass
logging.setLoggerClass(CustomLogger)
You could then use it like this:
import CustomLogger # needs importing first, so the logger class is set up before the other imports
import sys
import numpy
from bs4 import BeautifulSoup
import logging
import mylibrary
import myotherlibrary
logger = logging.getLogger(application_name) #returns type CustomLogger
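As an aside, stock logging can approximate this without a custom logger class, by raising the level on the third-party loggers you want to quiet. The trade-off is that you do have to name them, so it loses the "no hardcoding" bonus. A minimal sketch (package names are examples):
import logging

logging.basicConfig(filename='debug.log', level=logging.DEBUG)

# silence DEBUG/INFO chatter from third-party packages
for noisy in ('numpy', 'bs4', 'urllib3'):
    logging.getLogger(noisy).setLevel(logging.WARNING)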
I've been trying to figure out how best to set this up. Cutting it down as much as I can: I have 4 Python files: core.py (main), logger_controller.py, config_controller.py, and a 4th as a module or singleton we'll just call tool.py.
The way I have it set up, logger_controller has an init function that sets up Python's built-in logging with the necessary levels, formatter, directory location, etc. I call this init function in main.
import logging
import logger_controller

def main():
    logger_controller.init_log()
    logger = logging.getLogger(__name__)

if __name__ == "__main__":
    main()
config_controller uses configparser and is essentially a singleton acting as a controller for my config.
import configparser
import logging

logger = logging.getLogger(__name__)

class ConfigController(object):
    def __init__(self, *file_names):
        self.config_parser = configparser.ConfigParser()
        found_files = self.config_parser.read(file_names)
        if not found_files:
            raise ValueError("No config file found.")
        self._validate()

    def _validate(self):
        ...

    def read_config(self, section, field):
        try:
            data = self.config_parser.get(section, field)
        except (configparser.NoSectionError, configparser.NoOptionError) as e:
            logger.error(e)
            data = None
        return data

config = ConfigController("config.ini")
Then my problem is creating the 4th file and making sure both my logger and config parser are running before it. I also want this 4th one to be a singleton, so it follows a similar format to config_controller.
So tool.py uses config_controller to pull anything it needs from the config file. It also has some error checking for when config_controller's read_config returns None, since that isn't validated in _validate. I did this because I wanted my logging to have a general layer of error checking and a more specific layer: _validate just checks that the required fields and sections are in the config file, and then wherever a field is read handles the extra error checking.
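For illustration, a minimal sketch of what that tool.py might look like (the class and field names here are hypothetical, just following the description above):
import logging
from config_controller import config

logger = logging.getLogger(__name__)

class Tool(object):
    def __init__(self):
        # read_config returns None for a missing field; _validate only
        # checks required sections/fields, so the extra check lives here
        self.work_dir = config.read_config("paths", "work_dir")
        if self.work_dir is None:
            logger.error("Field 'work_dir' missing; falling back to '.'")
            self.work_dir = "."

tool = Tool()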
So my main problem is this:
How do I arrange it so that my logger and config parser are both running and available before anything else? I'm very much willing to rework all of this, but I'd like to keep all the functionality.
One attempt I tried that works, but seems very messy, is making my logger_controller a singleton that just returns Python's logging module.
import logging
import os

class MyLogger(object):
    def __new__(cls, *args, **kwargs):
        init_log()
        return logging

def init_log():
    ...

mylogger = MyLogger()
Then in core.py
from logger_controller import mylogger
logger = mylogger.getLogger(__name__)
I feel like there should be a better way to do the above, but I'm honestly not sure how.
A few ideas:
Would I be able to extend the logging class instead of just using that init_log function?
Maybe there's a way I can make all 3 individual modules initialize in the correct order? My attempts here didn't quite work, as I also have some internal data that I don't want exposed to classes using the module, just the functionality.
I'd like all 3 - logging, config parsing, and the tool - to be available anywhere I import them.
How I have it set up now "works", but if I were to import tool.py anywhere in core.py and an error occurred that I needed to catch, my logger wouldn't be able to log it, as the tool loads before the init of my logger.
I was wondering what would be the best way for me to structure my logs in a special situation.
I have a series of Python services that use the same Python modules (e.g. com.py) for communicating with the hardware. I have logging implemented in these modules, and I would like it to be associated with the main service that is calling them.
How should I structure the logger logic so that if I have:
main_service_1 -> module_for_communication
the logging goes to file main_serv_1.log, and
main_service_2 -> module_for_communication
the logging goes to file main_serv_2.log?
What would be the best practice in this case, without hardcoding anything?
Is there a way to know which file is importing com.py, so that inside com.py I can use this information to adapt the logging to the caller?
In my experience, for a situation like this, the cleanest and easiest to implement strategy is to pass the logger to the code that does the logging.
So, create a logger for each service you want to have log to a different file, and pass that logger in to the code from your communications module. You can use __name__ to get the name of the current module (the actual module name, without the .py extension).
In the example below I implemented a fallback for the case when no logger is passed in as well.
com.py
from log import setup_logger

class Communicator(object):
    def __init__(self, logger=None):
        if logger is None:
            logger = setup_logger(__name__)
        self.log = logger

    def send(self, data):
        self.log.info('Sending %s bytes of data' % len(data))
svc_foo.py
from com import Communicator
from log import setup_logger

logger = setup_logger(__name__)

def foo():
    c = Communicator(logger)
    c.send('foo')
svc_bar.py
from com import Communicator
from log import setup_logger

logger = setup_logger(__name__)

def bar():
    c = Communicator(logger)
    c.send('bar')
log.py
from logging import FileHandler
import logging

def setup_logger(name):
    logger = logging.getLogger(name)
    handler = FileHandler('%s.log' % name)
    logger.addHandler(handler)
    return logger
main.py
from svc_bar import bar
from svc_foo import foo
import logging

# Add a StreamHandler for the root logger, so we get some console output in
# addition to file logging (for ease of testing). Also set the root logger's
# level to INFO so our messages don't get filtered.
logging.basicConfig(level=logging.INFO)

foo()
bar()
So, when you execute python main.py, this is what you'll get:
On the console:
INFO:svc_foo:Sending 3 bytes of data
INFO:svc_bar:Sending 3 bytes of data
And svc_foo.log and svc_bar.log will each contain one line:
Sending 3 bytes of data
If a client of the Communicator class uses it without passing in a logger, the log output will end up in com.log (fallback).
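For instance (a minimal illustration of the fallback path):
from com import Communicator

c = Communicator()  # no logger passed in, so com.py sets up its own
c.send('baz')       # this line ends up in com.log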
I see several options:
Option 1
Use __file__. __file__ is the pathname of the file from which the module was loaded. Depending on your structure, you should be able to identify the module by performing an os.path.split(), like so:
If the folder structure is
+- module1
| +- __init__.py
| +- main.py
+- module2
+- __init__.py
+- main.py
you should be able to obtain the module name with code placed in main.py:
import os

def get_name():
    # take the last directory component of this file's path, e.g. "module1"
    module_name = os.path.split(os.path.dirname(os.path.abspath(__file__)))[-1]
    return module_name
This is not exactly DRY, because you need the same code in both main.py files.
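One way to factor the duplication out is a tiny shared helper that each main.py calls with its own __file__ (the util module and function name here are hypothetical):
# util.py
import os

def get_name(file_path):
    # last directory component of the given file's path
    return os.path.split(os.path.dirname(os.path.abspath(file_path)))[-1]

# in module1/main.py:
#     from util import get_name
#     module_name = get_name(__file__)   # -> "module1"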
Option 2
A bit cleaner is to open 2 terminal windows and use an environment variable. E.g. you can define MOD_LOG_NAME="main_service_1" in one terminal and MOD_LOG_NAME="main_service_2" in the other. Then, in your Python code, you can use something like:
import os

LOG_PATH_NAME = os.environ['MOD_LOG_NAME']
This follows separation of concerns.
Update (since the question evolved a bit)
Once you've established the distinct name, all you have to do is to configure the logger:
import logging
logging.basicConfig(filename=LOG_PATH_NAME, level=logging.DEBUG)
(or use get_name() from Option 1 to build the filename) and run the program.
Trying to add a few imports to my IPython profile so that when I open a kernel in the Spyder IDE they're always loaded. Spyder has a Qt interface (I think??), so I (a) checked to make sure I was in the right directory for the profile using the ipython locate command in the terminal (OSX), and (b) placed the following code in my ipython_qtconsole_config.py file:
c.IPythonQtConsoleApp.exec_lines = ["import pandas as pd",
                                    "pd.set_option('io.hdf.default_format', 'table')",
                                    "pd.set_option('mode.chained_assignment','raise')",
                                    "from __future__ import division, print_function"]
But when I open a new window and type pd.__version__ I get the NameError: name 'pd' is not defined error.
Edit: I don't have any problems if I run ipython qtconsole from the Terminal.
Suggestions?
Thanks!
Whether Spyder uses a Qt interface or not shouldn't be related to which of the IPython config files you want to modify. The one you chose to modify, ipython_qtconsole_config.py, is the configuration file that is loaded when you launch IPython's Qt console, such as with the command-line command
user@system:~$ ipython qtconsole
(I needed to update pyzmq for this to work.)
If Spyder maintains a running IPython kernel and merely manages how to display that for you, then Spyder is probably just maintaining a regular IPython session, in which case you want your configuration settings to go into the file ipython_config.py at the same directory where you found ipython_qtconsole_config.py.
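If that's the case, the same exec_lines you used would go under the generic InteractiveShellApp key in ipython_config.py - a sketch reusing your options:
# ipython_config.py
c = get_config()

c.InteractiveShellApp.exec_lines = ["import pandas as pd",
                                    "pd.set_option('io.hdf.default_format', 'table')",
                                    "pd.set_option('mode.chained_assignment','raise')",
                                    "from __future__ import division, print_function"]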
I manage this slightly differently than you do. Inside of ipython_config.py the top few lines for me look like this:
# Configuration file for ipython.
from os.path import join as pjoin
from IPython.utils.path import get_ipython_dir

c = get_config()

c.InteractiveShellApp.exec_files = [
    pjoin(get_ipython_dir(), "profile_default", "launch.py")
]
What this does is obtain the IPython configuration directory, add on the profile_default subdirectory, and then add on the name launch.py, which is a file I created just to hold anything I want executed/loaded upon startup.
For example, here's the first bit from my file launch.py:
"""
IPython launch script
Author: Ely M. Spears
"""
import re
import os
import abc
import sys
import mock
import time
import types
import pandas
import inspect
import cPickle
import unittest
import operator
import warnings
import datetime
import dateutil
import calendar
import copy_reg
import itertools
import contextlib
import collections
import numpy as np
import scipy as sp
import scipy.stats as st
import scipy.weave as weave
import multiprocessing as mp
from IPython.core.magic import (
    Magics,
    register_line_magic,
    register_cell_magic,
    register_line_cell_magic
)
from dateutil.relativedelta import relativedelta as drr
###########################
# Pickle/Unpickle methods #
###########################
# See explanation at:
# < http://bytes.com/topic/python/answers/
# 552476-why-cant-you-pickle-instancemethods >
def _pickle_method(method):
    func_name = method.im_func.__name__
    obj = method.im_self
    cls = method.im_class
    return _unpickle_method, (func_name, obj, cls)

def _unpickle_method(func_name, obj, cls):
    for cls in cls.mro():
        try:
            func = cls.__dict__[func_name]
        except KeyError:
            pass
        else:
            break
    return func.__get__(obj, cls)

copy_reg.pickle(types.MethodType, _pickle_method, _unpickle_method)
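# With the hook above registered, bound methods pickle cleanly (under
# Python 2, which this file targets, hence cPickle/copy_reg), e.g.:
#     s = cPickle.dumps(some_instance.some_method)
#     cPickle.loads(s)()  # re-binds and calls the method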
#############
# Utilities #
#############
def interface_methods(*methods):
    """
    Class decorator that can decorate an abstract base class with method names
    that must be checked in order for isinstance or issubclass to return True.
    """
    def decorator(Base):
        def __subclasshook__(Class, Subclass):
            if Class is Base:
                all_ancestor_attrs = [attr
                                      for ancestor_class in Subclass.__mro__
                                      for attr in ancestor_class.__dict__.keys()]
                if all(method in all_ancestor_attrs for method in methods):
                    return True
            return NotImplemented
        Base.__subclasshook__ = classmethod(__subclasshook__)
        return Base
    return decorator
def interface(*attributes):
    """
    Class decorator checking for any kind of attributes, not just methods.

    Usage:

    @interface('foo', 'bar', 'baz')
    class Blah(object):
        pass

    Now, new classes will be treated as if they are subclasses of Blah, and
    instances will be treated as instances of Blah, provided they possess the
    attributes 'foo', 'bar', and 'baz'.
    """
    def decorator(Base):
        def checker(Other):
            return all(hasattr(Other, a) for a in attributes)
        def __subclasshook__(cls, Other):
            if checker(Other):
                return True
            return NotImplemented
        def __instancecheck__(cls, Other):
            return checker(Other)
        Base.__metaclass__.__subclasshook__ = classmethod(__subclasshook__)
        Base.__metaclass__.__instancecheck__ = classmethod(__instancecheck__)
        return Base
    return decorator
There's a lot more, probably dozens of helper functions, snippets of code I've thought are cool and just want to play with, etc. I also define some randomly generated toy data sets, like NumPy arrays and Pandas DataFrames, so that when I want to poke around with some one-off Pandas syntax or something, some toy data is always right there.
The other upside is that this factors out the custom imports, function definitions, etc. that I want loaded, so if I want the same things loaded for the notebook and/or the qt console, I can just add the same bit of code to exec the file launch.py, and make changes only in launch.py without having to manually migrate them to each of the three configuration files.
I also uncomment a few of the different settings, especially for plain IPython and for the notebook, so the config files are meaningfully different from each other, just not based on what modules I want imported on start up.
I'm currently writing a tiny API to support extending module classes. Users should be able to just write their class name in a config and have it used in our program. The contract is that the class' module has a function called create(**kwargs) that returns an instance of our base module class, and is placed in a special folder. But the isinstance check fails as soon as the import is made dynamically.
Modules are placed in lib/services/name.
Module base class (in lib/services/service):
class Service:
    def __init__(self, **kwargs):
        # some initialization
        pass
Example module class (in lib/services/ping):
class PingService(Service):
    def __init__(self, **kwargs):
        Service.__init__(self, **kwargs)
        # uninteresting init

def create(kwargs):
    return PingService(**kwargs)
Importing function:
import sys
from lib.services.service import Service

def doimport(clazz, modPart, kw, class_check):
    path = "lib/" + modPart
    sys.path.append(path)
    mod = __import__(clazz)
    item = mod.create(kw)
    if class_check(item):
        print "im happy"
    return item
Calling code:
class_check = lambda service: isinstance(service, Service)

s = doimport("ping", "services", {}, class_check)
print s

from lib.services.ping import create

pingService = create({})
if isinstance(pingService, Service):
    print "why this?"
What am I doing wrong?
Here is a small example zipped up - just extract and run test.py without arguments: zip example
The problem was in your ping.py file. When imported dynamically it was not accepting the line from service import Service, so you have to change it to the full path: from lib.services.service import Service. Adding lib/services to sys.path did not make the inheritance work, which I found strange - most likely the service module ends up imported twice under two different names, giving you two distinct Service classes, so the isinstance check compares against the wrong one.
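A quick way to see that effect (a sketch assuming the question's lib/services layout, with __init__.py files in lib/ and lib/services/):
import sys

# the same file imported under two different module names yields
# two distinct class objects, so isinstance checks across them fail
sys.path.append('lib/services')
import service as service_a                     # loaded as 'service'
from lib.services import service as service_b  # loaded as 'lib.services.service'

print service_a.Service is service_b.Service   # False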
Also, I am using imp.load_source which seems more robust:
import os, imp

def doimport(clazz, modPart, kw, class_check):
    path = os.path.join('lib', modPart, clazz + '.py')
    mod = imp.load_source(clazz, path)
    item = mod.create(kw)
    if class_check(item):
        print "im happy"
    return item
Maybe it's lack of sleep, but I feel silly that I can't get this. I have a plugin, I see it get loaded, but I can't instantiate it in my main file:
from transformers.FOMIBaseClass import find_plugins, register
find_plugins()
Here's my FOMIBaseClass:
from PluginBase import MountPoint
import sys
import os

class FOMIBaseClass(object):
    __metaclass__ = MountPoint

    def __init__(self):
        pass

    def init_plugins(self):
        pass

def find_plugins():
    plugin_dir = os.path.dirname(os.path.realpath(__file__))
    plugin_files = [x[:-3] for x in os.listdir(plugin_dir) if x.endswith("Transformer.py")]
    sys.path.insert(0, plugin_dir)
    for plugin in plugin_files:
        mod = __import__(plugin)
Here's my MountPoint:
class MountPoint(type):
    def __init__(cls, name, bases, attrs):
        if not hasattr(cls, 'plugins'):
            cls.plugins = []
        else:
            cls.plugins.append(cls)
I see it being loaded:
# /Users/carlos/Desktop/ws_working_folder/python/transformers/SctyDistTransformer.pyc matches /Users/carlos/Desktop/ws_working_folder/python/transformers/SctyDistTransformer.py
import SctyDistTransformer # precompiled from /Users/carlos/Desktop/ws_working_folder/python/transformers/SctyDistTransformer.pyc
But, for the life of me, I can't instantiate the 'SctyDistTransformer' class from the main file. I know I'm missing something trivial. Basically, I want to employ a class-loading plugin.
To dynamically load Python modules from arbitrary folders, use the imp module:
http://docs.python.org/library/imp.html
Specifically, the code should look like:
import imp

mod = imp.load_source("MyModule", "MyModule.py")
clz = getattr(mod, "MyClassName")
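Tying this back to the question's mount-point pattern: loading a plugin file executes its class definitions, which registers each subclass via the metaclass, so you can then instantiate from the registry. A sketch using the question's names (paths depend on your layout):
import imp
from transformers.FOMIBaseClass import FOMIBaseClass

# executing the module body registers its classes on FOMIBaseClass.plugins
imp.load_source("SctyDistTransformer", "transformers/SctyDistTransformer.py")

# instantiate whatever got registered
for plugin_cls in FOMIBaseClass.plugins:
    transformer = plugin_cls()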
Also, if you are building a serious plug-in architecture, I recommend using Python eggs and entry points:
http://wiki.pylonshq.com/display/pylonscookbook/Using+Entry+Points+to+Write+Plugins
https://github.com/miohtama/vvv/blob/master/vvv/main.py#L104