Capture log output from subprocesses

Capture log output from subprocesses - python

So let's say I've got a piece of code that looks like this:
main.py
def get_stdout():
sys.stdout = open(str(os.getpid()) + ".out", "w")
foo.foo()
p = Process(target=get_stdout)
p.start()
foo.py
def foo():
my_logger.info('LOG INFO HERE')
my_logger = logging.getLogger()
my_logger.setLevel(logging.DEBUG)
logHandler = logging.StreamHandler()
logHandler.setFormatter(logging.Formatter('LOG: - %(asctime)s - %(name)s - %(levelname)s - %(message)s'))
my_logger.addHandler(logHandler)
The logger is defined at the bottom the foo module. When I call python main.py, the intention is to spawn a subprocess that calls foo() from the foo module and capture its log output and write it to a file. This example doesn't work because the output stream of the logger object is defined when the module is first initialized, so it just gets written to terminal and not to the file.
What's the best way to get around this? Right now, each module has only a single instance of the logger class and I'm sure there's a better way to do this, but I'm drawing a blank on being able to use the logging module and still be able to isolate loglines from separate processes.

By default, log messages are written to sys.stderr, not sys.stdout, so you need to redirect stderr instead:
def get_stdout(): # Maybe rename this
sys.stderr = open(str(os.getpid()) + ".out", "w")
foo.foo()

Related

I can't get Python logging module to write to the log file from different module

I have a project that consists of several modules. There is main module (main.py) that creates a TK GUI and loads the data. It passes this data to process.py which processes the data using functions from checks.py. I am trying to implement logging for all the modules to log to a file. In the main.py log messages are written to the log file but in the other modules they are only written to the console. I assume its to do with the import module line executing part of the code before the code in main.py has set up the logger, but i can't work out how to arrange it to avoid that. It seems like a reasonably common question on Stackoverflow, but i couldn't get the other answers to work for me. I am sure I am not missing much. Simplified code is shown below:
Moving the logging code inside and outside of various functions in the modules. The code I used to start me off is the code from Corey Schaffer's Youtube channel.
Main.py
import logging
from process import process_data
def main():
logger = logging.getLogger()
logger.setLevel(logging.DEBUG)
formatter = logging.Formatter('%(asctime)s:%(name)s:%(message)s')
templogfile = tempfile.gettempdir() + '\\' + 'TST_HA_Debug.log'
file_handler = logging.FileHandler(templogfile)
file_handler.setLevel(logging.DEBUG)
file_handler.setFormatter(formatter)
stream_handler = logging.StreamHandler()
stream_handler.setFormatter(formatter)
logger.addHandler(file_handler)
logger.addHandler(stream_handler)
logger.debug('Logging has started') # This gets written to the file
process_data(data_object) # call process_data in process.py
process.py
import logging
def process_data(data):
logger = logging.getLogger(__name__)
logger.debug('This message is logged by process') #This wont get written to the log file but get written to the console
#do some stuff with data here and log some msgs
return
Main.py will write to the log file, process.py will only write to the console.

I've rewritten your scripts a little so that this code can stand alone. If I changed this too much let me know and I can revisit it. These two files are an example of having it log to file. Note my comments:
## main.py
import logging
from process import process_data
import os
def main():
# Give this logger a name
logger = logging.getLogger("Example")
logger.setLevel(logging.DEBUG)
formatter = logging.Formatter('%(asctime)s:%(name)s:%(message)s')
# I changed this to the same directory, for convenience
templogfile = os.path.join(os.getcwd(), 'TST_HA_Debug.log')
file_handler = logging.FileHandler(templogfile)
file_handler.setLevel(logging.DEBUG)
file_handler.setFormatter(formatter)
stream_handler = logging.StreamHandler()
stream_handler.setFormatter(formatter)
logger.addHandler(file_handler)
logger.addHandler(stream_handler)
logging.getLogger("Example").debug('Logging has started') # This still gets written to the file
process_data() # call process_data in process.py
if __name__ == '__main__':
main()
## process.py
import logging
def process_data(data=None):
# make sure to grab the correct logger
logger = logging.getLogger("Example")
logger.debug('This message is logged by process') # This does get logged to file now
#do some stuff with data here and log some msgs
return
Why does this work? Because the module-level functions use the default root logger, which is not the one you've configured. For more details on this see these docs. There is a similar question that goes more into depth here.
By getting the configured logger before you start logging, you are able to log to the right configuration. Hope this helps!

Logs not printing in file in python

I am trying to print logs using logger module in python.
Following is the code I am keeping on the top of file.
if __name__ == '__main__':
LOG_FILENAME = '/home/akash/exdion-pdf-extracter/doc/epod.log'
logging.basicConfig(
filename=LOG_FILENAME,
level=logging.DEBUG,
)
There are different files with function calls from one another. I have used the following line to display a line in the logger.
#staticmethod
def initiate_pdf_processing(ct_doc, pt_doc, feature, startAndEndKeyList):
logging.info("testing logger")
...
There are multiple instances of the similar above logger function. But I can't receive the logger output in the designated file. The code and files are huge. However there are a few error generated by the code which are getting printed in the log file.

Use below code bit out of the main namespace. This way, you are defining a logger and creating a log file as global file, and you can call the logger anywhere in the code. A logger code bit below is how I usually code.
logfile = '<your_file_name>.log'
if(os.path.isfile(logfile)):
os.remove(logfile)
file_handler = logging.FileHandler(logfile)
file_handler.setLevel(logging.DEBUG)
file_handler.setFormatter(logging.Formatter(
'%(asctime)s %(pathname)s [%(process)d]: %(levelname)s:: %(message)s'))
logger = logging.getLogger('wbs-server-log')
logger.setLevel(logging.DEBUG)
logger.addHandler(file_handler)

The issue might be that you have to initialize logging above if __name__ == '__main__' block. That way logging will be initialized when you import this as module.
Suggestion for initializing logging:
import logging
log = logging.getLogger(PACKAGE_NAME)
stream_handler = logging.StreamHandler(stream=open(LOG_FILE_NAME, 'a'))
stream_handler.setLevel(logging.DEBUG)
log.addHandler(stream_handler)
log.debug('your message here')
After this you can tweak log message formatting with logging.Formatter.

Python - Logger over multiple files

I created a module named log.py where a function defines how the log will be registered. Here is the atomic code:
import logging
import time
def set_up_log():
"""
Create a logging file.
"""
#
# Create the parent logger.
#
logger = logging.getLogger(__name__)
logger.setLevel(logging.INFO)
#
# Create a file as handler.
#
file_handler = logging.FileHandler('report\\activity.log')
file_handler.setLevel(logging.INFO)
formatter = logging.Formatter('%(asctime)s - %(filename)s - %(name)s - % (levelname)4s - %(message)s')
file_handler.setFormatter(formatter)
logger.addHandler(file_handler)
#
# Start recording.
#
logger.info('______ STARTS RECORDING _______')
if __name__=='__main__':
set_up_log()
A second module named read_file.py is using this log.py to record potential error.
import logging
import log
log.set_up_log()
logger = logging.getLogger(__name__)
def read_bb_file(input_file):
"""
Input_file must be the path.
Open the source_name and read the content. Return the result.
"""
content = list()
logger.info('noi')
try:
file = open(input_file, 'r')
except IOError, e:
logger.error(e)
else:
for line in file:
str = line.rstrip('\n\r')
content.append(str)
file.close()
return content
if __name__ == "__main__":
logger.info("begin execution")
c = read_bb_file('textatraiter.out')
logger.info("end execution")
In the command prompt lauching read_file.py, I get this error:
No handlers could be found for logger "__main__"
My result in the file is the following
2014-05-12 13:32:58,690 - log.py - log - INFO - ______ STARTS RECORDING _______
I read lots of topics here and on Py Doc but it seems I did not understand them properly since I have this error.
I add I would like to keep the log settlement appart in a function and not define it explicitely in my main method.

You have 2 distinct loggers and you're only configuring one.
The first is the one you make in log.py and set up correctly. Its name however will be log, because you have imported this module from read_file.py.
The second logger, the one you're hoping is the same as the first, is the one you assign to the variable logger in read_file.py. Its name will be __main__ because you're calling this module from the command line. You're not configuring this logger.
What you could do is to add a parameter to set_up_log to pass the name of the logger in, e.g.
def set_up_log(logname):
logger = logging.getLogger(logname)
That way, you will set the handlers and formatters for the correct logging instance.
Organizing your logs in a hierarchy is the way logging was intended to be used by Vinay Sajip, the original author of the module. So your modules would only log to a logging instance with the fully qualified name, as given by __name__. Then your application code could set up the loggers, which is what you're trying to accomplish with your set_up_log function. You just need to remember to pass it the relevant name, that's all. I found this reference very useful at the time.

Redirect Python 'print' output to Logger

I have a Python script that makes use of 'Print' for printing to stdout. I've recently added logging via Python Logger and would like to make it so these print statements go to logger if logging is enabled. I do not want to modify or remove these print statements.
I can log by doing 'log.info("some info msg")'. I want to be able to do something like this:
if logging_enabled:
sys.stdout=log.info
print("test")
If logging is enabled, "test" should be logged as if I did log.info("test"). If logging isn't enabled, "test" should just be printed to the screen.
Is this possible? I know I can direct stdout to a file in a similar manner (see: redirect prints to log file)

You have two options:
Open a logfile and replace sys.stdout with it, not a function:
log = open("myprog.log", "a")
sys.stdout = log
>>> print("Hello")
>>> # nothing is printed because it goes to the log file instead.
Replace print with your log function:
# If you're using python 2.x, uncomment the next line
#from __future__ import print_function
print = log.info
>>> print("Hello!")
>>> # nothing is printed because log.info is called instead of print

Of course, you can both print to the standard output and append to a log file, like this:
# Uncomment the line below for python 2.x
#from __future__ import print_function
import logging
logging.basicConfig(level=logging.INFO, format='%(message)s')
logger = logging.getLogger()
logger.addHandler(logging.FileHandler('test.log', 'a'))
print = logger.info
print('yo!')

One more method is to wrap the logger in an object that translates calls to write to the logger's log method.
Ferry Boender does just this, provided under the GPL license in a post on his website. The code below is based on this but solves two issues with the original:
The class doesn't implement the flush method which is called when the program exits.
The class doesn't buffer the writes on newline as io.TextIOWrapper objects are supposed to which results in newlines at odd points.
import logging
import sys
class StreamToLogger(object):
"""
Fake file-like stream object that redirects writes to a logger instance.
"""
def __init__(self, logger, log_level=logging.INFO):
self.logger = logger
self.log_level = log_level
self.linebuf = ''
def write(self, buf):
temp_linebuf = self.linebuf + buf
self.linebuf = ''
for line in temp_linebuf.splitlines(True):
# From the io.TextIOWrapper docs:
# On output, if newline is None, any '\n' characters written
# are translated to the system default line separator.
# By default sys.stdout.write() expects '\n' newlines and then
# translates them so this is still cross platform.
if line[-1] == '\n':
self.logger.log(self.log_level, line.rstrip())
else:
self.linebuf += line
def flush(self):
if self.linebuf != '':
self.logger.log(self.log_level, self.linebuf.rstrip())
self.linebuf = ''
logging.basicConfig(
level=logging.DEBUG,
format='%(asctime)s:%(levelname)s:%(name)s:%(message)s',
filename="out.log",
filemode='a'
)
stdout_logger = logging.getLogger('STDOUT')
sl = StreamToLogger(stdout_logger, logging.INFO)
sys.stdout = sl
stderr_logger = logging.getLogger('STDERR')
sl = StreamToLogger(stderr_logger, logging.ERROR)
sys.stderr = sl
This allows you to easily route all output to a logger of your choice. If needed, you can save sys.stdout and/or sys.stderr as mentioned by others in this thread before replacing it if you need to restore it later.

A much simpler option,
import logging, sys
logging.basicConfig(filename='path/to/logfile', level=logging.DEBUG)
logger = logging.getLogger()
sys.stderr.write = logger.error
sys.stdout.write = logger.info

Once your defined your logger, use this to make print redirect to logger even with mutiple parameters of print.
print = lambda *tup : logger.info(str(" ".join([str(x) for x in tup])))

You really should do that the other way: by adjusting your logging configuration to use print statements or something else, depending on the settings. Do not overwrite print behaviour, as some of the settings that may be introduced in the future (eg. by you or by someone else using your module) may actually output it to the stdout and you will have problems.
There is a handler that is supposed to redirect your log messages to proper stream (file, stdout or anything else file-like). It is called StreamHandler and it is bundled with logging module.
So basically in my opinion you should do, what you stated you don't want to do: replace print statements with actual logging.

Below snipped works perfectly inside my PySpark code. If someone need in case for understanding -->
import os
import sys
import logging
import logging.handlers
log = logging.getLogger(__name_)
handler = logging.FileHandler("spam.log")
formatter = logging.Formatter("%(asctime)s - %(name)s - %(levelname)s - %(message)s")
handler.setFormatter(formatter)
log.addHandler(handler)
sys.stderr.write = log.error
sys.stdout.write = log.info
(will log every error in "spam.log" in the same directory, nothing will be on console/stdout)
(will log every info in "spam.log" in the same directory,nothing will be on console/stdout)
to print output error/info in both file as well as in console remove above two line.
Happy Coding
Cheers!!!

Weird stuff happening while importing modules

I hate to give the question this heading but I actually don't know whats happening so here it goes.
I was doing another project in which I wanted to use logging module. The code is distributed among few files & instead of creating separate logger objects for seperate files, I thought of creating a logs.py with contents
import sys, logging
class Logger:
def __init__(self):
formatter = logging.Formatter('%(filename)s:%(lineno)s %(levelname)s:%(message)s')
stdout_handler = logging.StreamHandler(sys.stdout)
stdout_handler.setFormatter(formatter)
self.logger=logging.getLogger('')
self.logger.addHandler(stdout_handler)
self.logger.setLevel(logging.DEBUG)
def debug(self, message):
self.logger.debug(message)
and use this class like (in different files.)
import logs
b = logs.Logger()
b.debug("Hi from a.py")
I stripped down the whole problem to ask the question here. Now, I have 3 files, a.py, b.py & main.py. All 3 files instantiate the logs.Logger class and prints a debug message.
a.py & b.py imports "logs" and prints their debug message.
main.py imports logs, a & b; and prints it own debug message.
The file contents are like this: http://i.imgur.com/XoKVf.png
Why is debug message from b.py printed 2 times & from main.py 3 times?

Specify a name for the logger, otherwise you always use root logger.
import sys, logging
class Logger:
def __init__(self, name):
formatter = logging.Formatter('%(filename)s:%(lineno)s %(levelname)s:%(message)s')
stdout_handler = logging.StreamHandler(sys.stdout)
stdout_handler.setFormatter(formatter)
self.logger=logging.getLogger(name)
self.logger.addHandler(stdout_handler)
self.logger.setLevel(logging.DEBUG)
def debug(self, message):
self.logger.debug(message)
http://docs.python.org/howto/logging.html#advanced-logging-tutorial :
A good convention to use when naming loggers is to use a module-level
logger, in each module which uses logging, named as follows:
logger = logging.getLogger(__name__)

logging.getLogger('') will return exactly the same object each time you call it. So each time you instantiate a Logger (why use old-style classes here?) you are attaching one more handler resulting in printing to one more target. As all your targets pointing to the same thing, the last call to .debug() will print to each of the three StreamHandler objects pointing to sys.stdout resulting in three lines being printed.

First. Don't create your own class of Logger.
Just configure the existing logger classes with exising logging configuration tools.
Second. Each time you create your own class of Logger you also create new handlers and then attach the new (duplicating) handler to the root logger. This leads to duplication of messages.
If you have several modules that must (1) run stand-alone and (2) also run as part of a larger, composite, application, you need to do this. This will assure that logging configuration is done only once.
import logging
logger= logging.getLogger( __file__ ) # Unique logger for a, b or main
if __name__ == "__main__":
logging.basicConfig( stream=sys.stdout, level=logging.DEBUG, format='%(filename)s:%(lineno)s %(levelname)s:%(message)s' )
# From this point forward, you can use the `logger` object.
logger.info( "Hi" )

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.

Capture log output from subprocesses - python

By default, log messages are written to sys.stderr, not sys.stdout, so you need to redirect stderr instead: def get_stdout(): # Maybe rename this sys.stderr = open(str(os.getpid()) + ".out", "w") foo.foo()

Related

I can't get Python logging module to write to the log file from different module

Logs not printing in file in python

Python - Logger over multiple files

Redirect Python 'print' output to Logger

Weird stuff happening while importing modules

Categories

Resources