I found some code online that generally works, but I want to use it multiple times in the same program (write different things to different files, while still printing to the screen the whole time).
That is to say, once the writer closes, I think sys.stdout gets closed as well, so printing at all, and using this class again, fails. I tried re-importing sys, and other dumb stuff, but I can't get it to work.
Here's the site, and the code:
groups.google.com/group/comp.lang.python/browse_thread/thread/d25a9f5608e473af/
import sys

class MyWriter:

    def __init__(self, stdout, filename):
        self.stdout = stdout
        self.logfile = file(filename, 'a')

    def write(self, text):
        self.stdout.write(text)
        self.logfile.write(text)

    def close(self):
        self.stdout.close()
        self.logfile.close()

writer = MyWriter(sys.stdout, 'log.txt')
sys.stdout = writer
print 'test'
You are trying to reproduce poorly something that is done very well by the Python Standard Library; please check the logging module.
With this module you can do exactly what you want, but in a much simpler, standard, and extensible manner. You can proceed as follows (this example is a copy/paste from the logging cookbook):
Let’s say you want to log to console and file with different message
formats and in differing circumstances. Say you want to log messages
with levels of DEBUG and higher to file, and those messages at level
INFO and higher to the console. Let’s also assume that the file should
contain timestamps, but the console messages should not. Here’s how
you can achieve this:
import logging
# set up logging to file - see previous section for more details
logging.basicConfig(level=logging.DEBUG,
                    format='%(asctime)s %(name)-12s %(levelname)-8s %(message)s',
                    datefmt='%m-%d %H:%M',
                    filename='/temp/myapp.log',
                    filemode='w')
# define a Handler which writes INFO messages or higher to the sys.stderr
console = logging.StreamHandler()
console.setLevel(logging.INFO)
# set a format which is simpler for console use
formatter = logging.Formatter('%(name)-12s: %(levelname)-8s %(message)s')
# tell the handler to use this format
console.setFormatter(formatter)
# add the handler to the root logger
logging.getLogger().addHandler(console)
# Now, we can log to the root logger, or any other logger. First the root...
logging.info('Jackdaws love my big sphinx of quartz.')
# Now, define a couple of other loggers which might represent areas in your
# application:
logger1 = logging.getLogger('myapp.area1')
logger2 = logging.getLogger('myapp.area2')
logger1.debug('Quick zephyrs blow, vexing daft Jim.')
logger1.info('How quickly daft jumping zebras vex.')
logger2.warning('Jail zesty vixen who grabbed pay from quack.')
logger2.error('The five boxing wizards jump quickly.')
When you run this, on the console you will see
root        : INFO     Jackdaws love my big sphinx of quartz.
myapp.area1 : INFO     How quickly daft jumping zebras vex.
myapp.area2 : WARNING  Jail zesty vixen who grabbed pay from quack.
myapp.area2 : ERROR    The five boxing wizards jump quickly.
and in the file you will see something like
10-22 22:19 root         INFO     Jackdaws love my big sphinx of quartz.
10-22 22:19 myapp.area1  DEBUG    Quick zephyrs blow, vexing daft Jim.
10-22 22:19 myapp.area1  INFO     How quickly daft jumping zebras vex.
10-22 22:19 myapp.area2  WARNING  Jail zesty vixen who grabbed pay from quack.
10-22 22:19 myapp.area2  ERROR    The five boxing wizards jump quickly.
As you can see, the DEBUG message only shows up in the file. The other
messages are sent to both destinations.
This example uses console and file handlers, but you can use any
number and combination of handlers you choose.
Easy-peasy with Python 3.3 and above
Starting with Python 3.3, doing so has become significantly easier, since logging.basicConfig now accepts the handlers argument.
import logging
level = logging.INFO
format = ' %(message)s'
handlers = [logging.FileHandler('filename.log'), logging.StreamHandler()]
logging.basicConfig(level=level, format=format, handlers=handlers)
logging.info('Hey, this is working!')
Note, however, that certain Python modules may also emit logging messages at the INFO level.
This is where it comes in handy to create a custom logging level, called for example OK, 5 levels above the default INFO level and 5 levels below the default WARNING level.
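A minimal sketch of such a custom level (the name OK and the helper method are my own illustration, not part of the logging API):
import logging

OK = 25  # 5 above INFO (20), 5 below WARNING (30)
logging.addLevelName(OK, 'OK')

def ok(self, message, *args, **kwargs):
    # convenience method so you can call logger.ok(...) like logger.info(...)
    if self.isEnabledFor(OK):
        self._log(OK, message, args, **kwargs)

logging.Logger.ok = ok

logging.basicConfig(level=OK, format='%(levelname)s %(message)s')
logging.getLogger(__name__).ok('Only OK and above reaches the handlers now.')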
I know this is an old question, and the best answer is just to use logging for its intended purpose, but I just wanted to point out that if you're concerned only with affecting calls to print specifically (and not other interaction with sys.stdout), and you just want to paste a few lines into some old one-off script, there's nothing stopping you from simply rebinding the name print to a different function which writes to two different files, since print is a function in Python 3+. You could even, god forbid, use a lambda with an or chain for the quickest, dirtiest solution out there:
old_print = print
log_file = open("logfile.log", "a")
print = lambda *args, **kw: old_print(*args, **kw) or old_print(*args, file=log_file, **kw)
print("Hello console and log file")
# ... more calls to print() ...
log_file.close()
Or for true fire-and-forget:
import atexit
old_print = print
log_file = open("logfile.log", "a")
atexit.register(log_file.close)
print = lambda *args, **kw: old_print(*args, **kw) or old_print(*args, file=log_file, **kw)
# ... do calls to print(), and you don't even have to close the file afterwards ...
It works fine assuming the program exits properly, but please no one use this in production code, just use logging :)
Edit: If you value some form of structure and want to write to the log file in real-time, consider something like:
from typing import Callable

def print_logger(
    old_print: Callable,
    file_name: str,
) -> Callable:
    """Returns a function which calls `old_print` twice, specifying a `file=` on the second call.

    Arguments:
        old_print: The `print` function to call twice.
        file_name: The name to give the log file.
    """

    def log_print(*args, **kwargs):
        old_print(*args, **kwargs)
        with open(file_name, "a") as log_file:
            old_print(*args, file=log_file, **kwargs)

    return log_print
And then invoke as follows:
print = print_logger(print, "logs/my_log.log")
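After that reassignment every print call writes to both places (this assumes the logs/ directory already exists, since open() will not create it):
print("Starting run")  # appears on the console and is appended to logs/my_log.log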
Remove the line that's doing what you explicitly say you don't want done: the first line of close(), which closes stdout.
That is to say, when it closes, I think sys.stdout closes, so printing
at all, and using this class again fails. I tried reimporting sys, and
other dumb stuff, but I can't get it to work.
To answer your question, you should not be closing stdout. The Python interpreter opens stdout, stdin and stderr at startup. In order for print to work, the interpreter requires stdout to be open. Re-importing sys does not do anything once a module has been loaded; you would need to reload the module. In this particular case, I am not sure a reload would fix the problem, since sys.stdout allows stdout to be used as a file object.
Additionally, I think you have a bug in your code which may be causing print to break. On the second of the two lines below, you are assigning a MyWriter object to sys.stdout. This may be closing stdout when the garbage collector deletes the unused stdout file object.
writer = MyWriter(sys.stdout, 'log.txt')
sys.stdout = writer
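A minimal sketch of that fix, keeping the rest of the class from the question (written for Python 3, where file() and the print statement no longer exist):
import sys

class MyWriter:
    def __init__(self, stdout, filename):
        self.stdout = stdout
        self.logfile = open(filename, 'a')

    def write(self, text):
        self.stdout.write(text)
        self.logfile.write(text)

    def flush(self):
        # print() may flush the stream; delegate to both targets
        self.stdout.flush()
        self.logfile.flush()

    def close(self):
        # close only the file we opened; leave stdout alone
        self.logfile.close()

writer = MyWriter(sys.stdout, 'log.txt')
sys.stdout = writer
print('goes to the screen and to log.txt')

sys.stdout = writer.stdout  # restore the real stdout
writer.close()              # the class can now be reused with another file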
I am using the logging module in python inside a function. A simplified structure of the code is like below.
def testfunc(df):
    import logging
    import sys
    from datetime import datetime

    logger = logging.getLogger()
    logger.setLevel(logging.INFO)

    # to print to the screen
    ch = logging.StreamHandler(sys.__stdout__)
    ch.setLevel(logging.INFO)
    logger.addHandler(ch)

    # to print to file
    fh = logging.FileHandler('./data/treatment/Treatment_log_' + str(datetime.today().strftime('%Y-%m-%d')) + '.log')
    fh.setLevel(logging.INFO)
    logger.addHandler(fh)

    # several lines of code and some information like:
    logger.info('Loop starting...')
    for i in range(6):  # actually a long for-loop
        # several lines of somewhat slow code (even with multiprocessing) and some information like:
        logger.info('test ' + str(i))

    logging.shutdown()
    return None
So, I know:
the logger needs to be shut down (logging.shutdown());
and it is included at the end of the function.
The issue is:
the actual function deals with subsets of a data frame, and sometimes it results in an error because there is not enough data, etc.
If I run the function again, all messages are repeated twice (or even more times, if I need to run it again).
The situation resembles the ones reported here, here, and here, for example... but it is slightly different...
I get it: it is because the logging module was not shut down and the handlers were not removed... And I understand that, for the final function, I should anticipate such situations and include steps to avoid raising errors, like shutting down the logger and finishing the function, etc. But currently I am even using the log information to identify such situations...
My question is: how can I shut it down once such a situation (function aborted because of an error) has happened, in my current situation in which I am just testing the code? Currently, the only way I have to make it stop is to start a new console in Spyder (in my understanding, restarting the kernel). What is the correct procedure here?
I appreciate any help...
I suppose you can check first to see whether there are any existing loggers:
loggers = [logging.getLogger(name) for name in logging.root.manager.loggerDict]
If there isn't one, you can create the logger; if there is, don't create a new one.
Alternatively, you can have another file set up the logger, and call this file through a subprocess.Popen() or similar.
The code for the first option is from here: How to list all existing loggers using python.logging module
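For illustration, a minimal sketch of that idea applied to the duplicated-message problem; the guard on logger.handlers and the names used here are my own assumptions, not from the linked post:
import logging
import sys

def get_logger():
    logger = logging.getLogger('treatment')
    logger.setLevel(logging.INFO)
    # only attach handlers the first time; later calls reuse the same logger
    if not logger.handlers:
        logger.addHandler(logging.StreamHandler(sys.__stdout__))
        logger.addHandler(logging.FileHandler('treatment.log'))
    return logger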
I'm implementing a simple logging class that writes out some messages to a log file. I have a doubt on how to manage the opening/closing of the file in a sensible and pythonic way.
I understood that the idiomatic way to write to files is via the with statement. Therefore this is a simplified version of the code I have:
class Logger():

    def __init__(self, filename, mode='w', name='root'):
        self.filename = filename
        self.name = name
        # remove the previous content of the file if mode for logger is 'w'
        if mode == 'w':
            with open(self.filename, 'w') as f:
                f.write('')

    def info(self, msg):
        with open(self.filename, 'a') as f:
            f.write(f'INFO:{self.name}:{msg}\n')
logger = Logger('log.txt')
logger.info('Starting program')
The problem is that this implementation will open and close the file as many times as the logger is called, which will be hundreds of times. I'm concerned that this adds overhead to the program (the runtime of this program matters). It would perhaps be more sensible to open the file when the logger is created, and close it when the program finishes. But this goes against the "use with" rule, and there is certainly a serious risk that I (or the user of the class) will forget to manually close the file at the end. Another problem with this approach is that if I want to create different loggers that dump to the same file, I'll have to add careful checks to know whether the file has already been opened by previous loggers...
So all in all, what's the most pythonic and sensible way to handle the opening/closing of files in this context?
While I agree with the other comments that the most pythonic way is to use the standard lib, I'll try to answer your question as it was asked.
I think the with construct is a great construct, but it doesn't mean it works in every situation. Opening a file handle and keeping it around for continual use is not unpythonic if it makes sense in your situation (IMO). Opening it, doing something, and closing it in the same function with try/except/finally blocks would be unpythonic. I think it may be preferable to only open it when you first try to use it (instead of at creation time), but that can depend on the rest of the application.
If you start creating different loggers that write to the same file, and they live in the same process, I would think the goal would be to have a single open file handle that all the loggers write to, instead of each logger having its own handle. But multi-instance and multi-process logging synchronization is where the stdlib shines, so...you know...your mileage may vary.
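A minimal sketch of what that could look like under those assumptions (lazy opening, one shared handle per file, optional use as a context manager); the class and attribute names are illustrative, not from the question:
class FileLogger:
    # one handle per filename, shared by every instance pointed at that file
    _handles = {}

    def __init__(self, filename, name='root'):
        self.filename = filename
        self.name = name

    def _handle(self):
        # open lazily on first use, then reuse the shared handle
        if self.filename not in FileLogger._handles:
            FileLogger._handles[self.filename] = open(self.filename, 'a')
        return FileLogger._handles[self.filename]

    def info(self, msg):
        self._handle().write(f'INFO:{self.name}:{msg}\n')

    def close(self):
        handle = FileLogger._handles.pop(self.filename, None)
        if handle is not None:
            handle.close()

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()

with FileLogger('log.txt') as logger:
    logger.info('Starting program')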
In my Flask application I have implemented a logging system using the logging library. It is currently run in the block below:
if __name__ == "__main__":
    """[Runs the webserver.

    Finally block is used for some logging management. It will first shut down
    logging, to ensure no files are open, then renames the file to 'log_'
    + the current date, and finally moves the file to the /logs archive
    directory]
    """
    try:
        session_management.clean_uploads_on_start(UPLOAD_FOLDER)
        app.run(debug=False)
    finally:
        try:
            logging.shutdown()
            new_log_file_name = log_management.rename_log(app.config['DEFAULT_LOG_NAME'])
            log_management.move_log(new_log_file_name)
        except FileNotFoundError:
            logging.warning("Current log file not found")
        except PermissionError:
            logging.warning("Permissions lacking to rename or move log.")
I discovered that the file is not renamed and moved if either the cmd prompt is force closed or the server crashes. I thought it might be better to put the rename and move into the initial 'try' block of the function, prior to the server starting, but I run into issues because I have a config file (which is imported in this script) containing the following code:
logging.basicConfig(filename='current_log.log', level=logging.INFO,
                    filemode='a',
                    format='%(asctime)s:%(levelname)s:%(message)s')
I have tried to do something like the below, but I still run into permission errors; I think that is because the log_management script also imports config. Further, I could not find a function which starts the logging system, similar to the logging.shutdown() that is used when the system ends; otherwise I would shut logging down, move the file (if it exists), and then start it back up.
try:
    session_management.clean_uploads_on_start(UPLOAD_FOLDER)
    log_management.check_log_on_startup(app.config['DEFAULT_LOG_NAME'])
    import config
    app.run(debug=False)
finally:
    try:
        logging.shutdown()
        new_log_file_name = log_management.rename_log(app.config['DEFAULT_LOG_NAME'])
        log_management.move_log(new_log_file_name)
    except FileNotFoundError:
        logging.warning("Current log file not found")
    except PermissionError:
        logging.warning("Permissions lacking to rename or move log.")

# (in another script)
def check_log_on_startup(file_name):
    if os.path.exists(file_name):
        move_log(rename_log(file_name))
Any suggestions much welcomed, because I feel like I'm at a brick wall!
As you have already found out, trying to perform cleanups at the end of your process life cycle has the potential to fail if the process terminates uncleanly.
The issue with performing the cleanup at the start is that you apparently call logging.basicConfig from your import before attempting to move the old log file.
This leads to the implicitly created FileHandler holding an open file object on the existing log when you attempt to rename and move it. Depending on the file system you are using, this might not be met with joy.
If you want to move the handling of potential old log files to the start of your application completely, you have to perform the renaming and moving before you call logging.basicConfig, so you'll have to remove it from your import and add it to the log_management somehow.
As an alternative, you could move the whole handling of log files to the logging file handler by subclassing the standard FileHandler class, e.g:
import logging
import os
from datetime import datetime

class CustomFileHandler(logging.FileHandler):

    def __init__(self, filename, archive_path='archive', archive_name='log_%Y%m%d', **kwargs):
        self._archive = os.path.join(archive_path, archive_name)
        self._archive_log(filename)
        super().__init__(filename, **kwargs)

    def _archive_log(self, filepath):
        if os.path.exists(filepath):
            os.rename(filepath, datetime.now().strftime(self._archive))

    def close(self):
        super().close()
        self._archive_log(self.baseFilename)
With this, you would configure your logging like so:
hdler = CustomFileHandler('current.log')

logging.basicConfig(level=logging.INFO, handlers=[hdler],
                    format='%(asctime)s:%(levelname)s:%(message)s')
The CustomFileHandler will check for, and potentially archive, old logs during initialization. This will deal with leftovers after an unclean process termination where the shutdown cleanup cannot take place. Since the parent class initializer is called after the log archiving is attempted, there is not yet an open handle on the log that would cause a PermissionError.
The overwritten close() method will perform the archiving on a clean process shutdown.
This should remove the need for the dedicated log_management module, at least as far as the functions you show in your code are concerned. rename_log, move_log and check_log_on_startup are all encapsulated in the CustomFileHandler. There is also no need to explicitly call logging.shutdown().
Some notes:
The reason you cannot find a start function equivalent to logging.shutdown() is that the logging system is started/initialized when you import the logging module. Among other things, it instantiates the implicit root logger and registers logging.shutdown as exit handler via atexit.
The latter is the reason why there is no need to explicitly call logging.shutdown() with the above solution. The Python interpreter will call it during finalization when preparing for interpreter shutdown due to the exit handler registration. logging.shutdown() then iterates through the list of registered handlers and calls their close() methods, which will perform the log archiving during a clean shutdown.
Depending on the method you choose for moving (and renaming) the old log file, the above solution might need some additional safeguards against exceptions. On Windows, os.rename will raise an exception if the destination path already exists, i.e. when you have already stopped and started your process on the same day, while os.replace would silently overwrite the existing file. See more details about moving files via Python here.
Thus I would recommend naming the archived logs not only by the current date but also by the time.
In the above, adding the current date to the archive file name is done via datetime's strftime, hence the 'log_%Y%m%d' as default for the archive_name parameter of the custom file handler. The characters with a preceding % are valid format codes that strftime() replaces with the respective parts of the datetime object it is called on. To append the current time to the archive log file name you would simply append the respective format codes to the archive_name, e.g.: 'log_%Y%m%d_%H%M%S' which would result in a log name such as log_20200819_123721.
I have a very simple function that is just a list of data writes, but each write takes 5-10s, so the function takes about an hour to run. Since there is no loop, there is no iteration variable. What is the best way to update progress to the user?
Have you considered the logging module? You can create different kinds of handlers to, well, handle log messages. Here's a simple example, but the general idea is you can just put logging messages in your script that write to a file, a stream that prints to the console, or something else.
import datetime
import logging
import time

logger = logging.getLogger('a_name')
logger.setLevel(logging.DEBUG)
sh = logging.StreamHandler()  # prints to console
logger.addHandler(sh)

with open('/tmp/test_file.txt', 'w') as f:
    logger.info('beginning writing file at ' + str(datetime.datetime.now()))
    time.sleep(30)  # this is a proxy for doing some file writing
    logger.info('the time now is ' + str(datetime.datetime.now()))
    ...
    logger.info('file done being written')
You might want to look into formatting the log messages so you don't have to include the datetime str in an inelegant way like I did.
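For instance, a minimal sketch of that (the exact format string is just an illustration): attach a Formatter to the handler and let asctime add the timestamp for you.
import logging

logger = logging.getLogger('a_name')
logger.setLevel(logging.DEBUG)

sh = logging.StreamHandler()
# %(asctime)s is filled in automatically, so messages no longer need datetime.now()
sh.setFormatter(logging.Formatter('%(asctime)s %(levelname)s %(message)s'))
logger.addHandler(sh)

logger.info('beginning writing file')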
I have searched here extensively without much luck specifically in regards to my question.
Currently, my program has issues relating to file descriptor overflows (too many open files), nearly all of which are directing to my single log file. My program is set out in this fashion:
In my main:
# Initializes program log handler
log = Log()
log.setup_custom_logger('root')
logger = logging.getLogger('root')
Within the Log class:
def setup_custom_logger(self, name):
    """ Sets base log handler for the program and log entry formatting """
    # Create logger
    logger = logging.getLogger(name)
    log_level = self.getLogLevel()
    logger.setLevel(log_level)

    # Sets formatting, output file and handler
    handler = logging.FileHandler(
        os.getenv("HOME") + config.LOG_DIRECTORY + 'qtgtool.log')
    handler.setFormatter(self.getFormat(log_level))

    # Add handler to logger
    logger.addHandler(handler)
And within any other class in the program, this is called in its __init__:
logger = logging.getLogger('root')
I have checked, and there is only ever one FileHandler object and that is used by all classes. As such, I have no idea why it would create so many file descriptors of my log when class objects are created... Or am I missing something? Is it a case of too many class objects all with logging capabilities?
The output from the traceback:
IOError: [Errno 24] Too many open files:/path/to/file/being/read.txt
This corresponds with lsof -p indicating 1024 open files (nearly all my log file). As a side note, I have seen the options for increasing the open file count. I do not want to do this, as I find this to totally miss the point of trying to fix the problem.
Further debugging by using this class instead of logging.FileHandler():
class FHandler(logging.FileHandler):

    def __init__(self, filename, encoding=None, delay=0):
        logging.FileHandler.__init__(self, filename, 'w', encoding, delay)

    def _open(self):
        stream = logging.FileHandler._open(self)
        print 'Opened file'
        return stream
This prints a single 'Opened file' to the console, whilst lsof (already, early stages of the program) presents over 100 log file references.
Many thanks and apologies for the deficiencies in my programming syntax.
Please try the following code to close the file:
logger.removeHandler(handler)
"handler" is the created "FileHandler" instance.
From here.
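For context, a minimal sketch of where that could go, closing the handler as well so the file descriptor is actually released (the 'root' name follows the question's setup_custom_logger call):
logger = logging.getLogger('root')
for handler in list(logger.handlers):
    logger.removeHandler(handler)
    handler.close()  # releases the open file descriptor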