Dynamically switching log storage folder - python

I am using Python's logging module to log the execution of functions and other actions within an application. The log files are stored in a remote folder that becomes accessible automatically when I connect to the VPN (let's say \remote\directory). That is the normal situation: 99% of the time the connection is there and the log is stored without errors.
I need a solution for the situation when either the VPN connection or the Internet connection is lost and the logs have to be stored locally for the time being. I think that each time something is about to be logged, I need to check whether the remote folder is accessible. I couldn't really find a solution, but I guess I need to modify the FileHandler somehow.
TL;DR: You can scroll straight down to blues's answer and my UPDATE section, which contains my latest attempt to solve the issue.
Currently my handler is set like this:
log = logging.getLogger('general')
handler_error = logging.handlers.RotatingFileHandler(log_path+"\\error.log", 'a', encoding="utf-8")
log.addHandler(handler_error)
Here is the condition that sets the log path, but it only runs once, when logging is initialized. If I think about it correctly, I would like to run this condition each time something is about to be logged:
if os.path.isdir(f"\\\\remote\\folder\\"):  # if remote is accessible
    log_path = f"\\\\remote\\folder\\dev\\{d.year}\\{month}\\"
    os.makedirs(os.path.dirname(log_path), exist_ok=True)  # create this month's dir if it does not exist, logging does not handle that
else:  # if remote is not accessible
    log_path = f"localFiles\\logs\\dev\\{d.year}\\{month}\\"
    log.debug("Cannot access the remote directory. Are you connected to the internet and the VPN?")
I have found a related thread, but was not able to adjust it to my own needs: Dynamic filepath & filename for FileHandler in logger config file in python
Should I dig deeper into writing a custom Handler, or is there some other way? It would be enough if I could call my own function that changes the logging path when needed (or switches to a logger with the proper path) at the moment a record is logged.
UPDATE:
Per blues's answer, I have tried modifying a handler to suit my needs. Unfortunately, the code below, in which I try to switch baseFilename between the local and remote paths, does not work. The logger always saves the log to the local log file (the one set while initializing the logger), so I suspect my attempts to modify baseFilename have no effect.
class HandlerCheckBefore(RotatingFileHandler):
    print("handler starts")

    def emit(self, record):
        calltime = date.today()
        if os.path.isdir(f"\\\\remote\\Path\\"):  # if remote is accessible
            print("handler remote")
            # create remote folders if not yet existent
            os.makedirs(os.path.dirname(f"\\\\remote\\Path\\{calltime.year}\\{calltime.strftime('%m')}\\"), exist_ok=True)
            if self.level >= 20:  # if error or above
                self.baseFilename = f"\\\\remote\\Path\\{calltime.year}\\{calltime.strftime('%m')}\\error.log"
            else:
                self.baseFilename = f"\\\\remote\\Path\\{calltime.year}\\{calltime.strftime('%m')}\\{calltime.strftime('%d')}-{calltime.strftime('%m')}.log"
            super().emit(record)
        else:  # save to local
            print("handler local")
            if self.level >= 20:  # error or above
                self.baseFilename = f"localFiles\\logs\\{calltime.year}\\{calltime.strftime('%m')}\\error.log"
            else:
                self.baseFilename = f"localFiles\\logs\\{calltime.year}\\{calltime.strftime('%m')}\\{calltime.strftime('%d')}-{calltime.strftime('%m')}.log"
            super().emit(record)

# init the logger
handler_error = HandlerCheckBefore(f"\\\\remote\\Path\\{calltime.year}\\{calltime.strftime('%m')}\\error.log", 'a', encoding="utf-8")
handler_error.setLevel(logging.ERROR)
handler_error.setFormatter(fmt)
log.addHandler(handler_error)
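A likely reason the reassignment above has no effect is that FileHandler keeps its original stream open, so a later change to baseFilename is never picked up. The following is only a minimal sketch of that idea (my assumption about the cause, not the accepted answer's approach): close the stream before switching the path, so the next emit() reopens the file at the new location. _pick_target_path() is a hypothetical helper mirroring the remote/local check above.
import os
from datetime import date
from logging.handlers import RotatingFileHandler

class SwitchingRotatingFileHandler(RotatingFileHandler):
    """Sketch: reopen the log file whenever the target path changes."""

    def _pick_target_path(self):
        # Hypothetical helper: choose the remote dir if reachable, else the local fallback
        calltime = date.today()
        if os.path.isdir("\\\\remote\\Path\\"):
            base = f"\\\\remote\\Path\\{calltime.year}\\{calltime.strftime('%m')}\\"
        else:
            base = f"localFiles\\logs\\{calltime.year}\\{calltime.strftime('%m')}\\"
        os.makedirs(base, exist_ok=True)
        return os.path.abspath(base + "error.log")

    def emit(self, record):
        new_path = self._pick_target_path()
        if new_path != self.baseFilename:
            if self.stream:
                self.stream.close()  # release the file that is currently open
                self.stream = None   # the next emit() reopens baseFilename
            self.baseFilename = new_path
        super().emit(record)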

The best way to solve this is indeed to create a custom Handler. You can either check before each write that the directory is still there, or you can attempt to write the log and handle the resulting error in handleError, which handlers call when an exception occurs during emit(). I recommend the former. The code below shows how both could be implemented:
import os
import logging
from logging.handlers import RotatingFileHandler

class GrzegorzRotatingFileHandlerCheckBefore(RotatingFileHandler):
    def emit(self, record):
        if os.path.isdir(os.path.dirname(self.baseFilename)):  # put appropriate check here
            super().emit(record)
        else:
            logging.getLogger('offline').error('Cannot access the remote directory. Are you connected to the internet and the VPN?')

class GrzegorzRotatingFileHandlerHandleError(RotatingFileHandler):
    def handleError(self, record):
        logging.getLogger('offline').error('Something went wrong when writing log. Probably remote dir is not accessible')
        super().handleError(record)

log = logging.getLogger('general')
log.addHandler(GrzegorzRotatingFileHandlerCheckBefore('check.log'))
log.addHandler(GrzegorzRotatingFileHandlerHandleError('handle.log'))

offline_logger = logging.getLogger('offline')
offline_logger.addHandler(logging.FileHandler('offline.log'))

log.error('test logging')
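One possible refinement, not part of the answer above: in both variants the failed record itself only triggers a new error message and is otherwise dropped. If you also want to keep the original record, the fallback logger can handle it directly, for example:
class GrzegorzRotatingFileHandlerHandleError(RotatingFileHandler):
    def handleError(self, record):
        offline = logging.getLogger('offline')
        offline.error('Something went wrong when writing log. Probably remote dir is not accessible')
        offline.handle(record)  # re-route the record that failed to the local fallback file
        super().handleError(record)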

Related

Python watch for file changes and identify user

I need to produce a small script that will watch for accidental changes made by users to a large shared file structure.
I have found I can get the change events using the ReadDirectoryChanges API as per
http://timgolden.me.uk/python/win32_how_do_i/watch_directory_for_changes.html
However, I can't see how I can identify the user account that has made the changes, so that I can send out a notification.
Is it possible to get the name of the user account that moved the file/directory?
Tricky question; I'll answer it in two parts:
First, as an optional part, you can watch the file modifications themselves and add custom actions.
Example of file modification tracking, working on Windows / Linux / Mac / BSD:
import time
import watchdog.events
import watchdog.observers

class StateHandler(watchdog.events.PatternMatchingEventHandler):
    def on_modified(self, event):
        print(event.event_type)
        print(event.key)
        print(event.src_path)
        # Add your code here to do whatever you want on file modification

    def on_created(self, event):
        pass

    def on_moved(self, event):
        pass

    def on_deleted(self, event):
        pass

fs_event_handler = StateHandler()
fs_observer = watchdog.observers.Observer()
fs_observer.schedule(fs_event_handler, r'C:\Users\SomeUser\SomeFolder', recursive=True)
fs_observer.start()

try:
    while True:
        time.sleep(2)
except KeyboardInterrupt:
    fs_observer.stop()
fs_observer.join()
Using the above filesystem observer, you can trigger security event log reviews.
You might also trigger them as a scheduled task, but it's more fun to trigger them on filesystem modifications.
In order for the security event logs to contain file modification information, you need to enable file auditing for the required directories using SACLs (right-click on your folder, then Security, then Auditing).
Then you can go through the security logs on file events.
Going through security logs can be done with windows_tools.
Get it installed with python -m pip install windows_tools.wmi_queries (obviously only works under Windows)
Then do the following:
from windows_tools.wmi_queries import *

result = query_wmi('SELECT * FROM Win32_NTLogEvent WHERE Logfile="Security" AND TimeGenerated > "{}"'.format(create_current_cim_timestamp(hour_offset=1)))
for r in result:
    print(r)
You can add WHERE clauses like EventCode={integer} in order to filter only the events (file modifications or otherwise) you need.
Usually the event codes you're searching for are 4656, 4660, 4663, 4670 (open, delete, edit, create).
See this Microsoft article in order to know what WHERE clauses the event log class accepts.
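For instance, a sketch of a filtered query (event code 4663 is just one of the codes listed above, used here as an illustration):
from windows_tools.wmi_queries import *

# Only object-access events (4663) from the last hour
query = ('SELECT * FROM Win32_NTLogEvent WHERE Logfile="Security" '
         'AND EventCode=4663 '
         'AND TimeGenerated > "{}"').format(create_current_cim_timestamp(hour_offset=1))
for event in query_wmi(query):
    print(event)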
DISCLAIMER: I'm the author of the windows_tools package.
In a UNIX environment you can use os and pwd to retrieve the user account that owns the changed file:
import os
import pwd
file_stat = os.stat("<changed_file>")
user_name = pwd.getpwuid(file_stat.st_uid).pw_name

Using Python Logging module to save logs to S3, how to capture all levels of logs for all modules?

It's my first time using the logging module in Python (3.7). My code uses imported modules that also have their own log statements. When I first added log statements to my code, I didn't use getLogger(); I just called logging.basicConfig(filename) and logged with logging.debug() directly. When I did this, all the logs from my script and from all the imported modules were output to the same file together.
Now I need to convert my code to save logs to s3 instead of a file. I tried the solution mentioned in How Can I Write Logs Directly to AWS S3 from Memory Without First Writing to stdout? (Python, boto3) - Stack Overflow but I have two issues with it:
None of the 'prefixes' are present in the output when I check it on S3.
Only INFO statements are showing up. I was under the impression that logging.basicConfig(level=logging.INFO) would make it output all logs at or above level INFO, but I'm only seeing INFO. Also, only INFO logs get printed to stdout, whereas before all levels were. I don't know why the 'prefixes' are missing either.
from psaw import PushshiftAPI
api = PushshiftAPI()
import time
import logging
import boto3
import io
import atexit

def write_logs(body, bucket, key):
    s3 = boto3.client("s3")
    s3.put_object(Body=body.getvalue(), Bucket=bucket, Key=key)

logging.basicConfig(level=logging.INFO)
log = logging.getLogger()
log_stringio = io.StringIO()
handler = logging.StreamHandler(log_stringio)
log.addHandler(handler)

def collectRange(sub, start, end):
    atexit.register(write_logs, body=log_stringio, bucket="<...>", key=f'{sub}/log.txt')
    s3 = boto3.resource('s3')
    object = s3.Object('<...>', f'{sub}/{sub}#{start}-{end}.csv')
    now = time.time()
    logging.info(f'Start Time:{now}')
    logging.debug('First request')
    gen = api.search_comments(after=start, before=end, <...>, subreddit=sub)
    r = next(gen)
    <...>

quit()
Output:
Found credentials in shared credentials file: ~/.aws/credentials
Start Time:1591310443.7060978
https://api.pushshift.io/reddit/comment/search?<...>
https://api.pushshift.io/reddit/comment/search?<...>
Desired output:
INFO:botocore.credentials:Found credentials in shared credentials file: ~/.aws/credentials
INFO:root:Start Time:1591310443.7060978
DEBUG:root:First request
INFO:psaw.PushshiftAPI:https://api.pushshift.io/reddit/comment/search?<...>
DEBUG:psaw.PushshiftAPI:<whatever is usually here>
DEBUG:psaw.PushshiftAPI:<whatever is usually here>
INFO:psaw.PushshiftAPI:https://api.pushshift.io/reddit/comment/search?<...>
DEBUG:psaw.PushshiftAPI:<whatever is usually here>
DEBUG:psaw.PushshiftAPI:<whatever is usually here>
Any help is appreciated. Thanks.
You can at least add the level name (or time) by following this documentation:
Changing the format of displayed messages.
And to get DEBUG messages as well, you need to use the following instead of INFO:
logging.basicConfig(..., level=logging.DEBUG)
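A minimal sketch combining both points (the format string is an assumption chosen to match the "LEVEL:name:message" style of the desired output; note that the StreamHandler feeding the in-memory buffer has no formatter of its own, so it needs one as well or the S3 copy will still contain bare messages):
fmt = '%(levelname)s:%(name)s:%(message)s'
logging.basicConfig(format=fmt, level=logging.DEBUG)

# Give the StringIO handler the same formatter so the S3 log keeps the prefixes too
handler.setFormatter(logging.Formatter(fmt))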

Changes not written to file correctly in Python 2.7

For a few days now, I have been struggling with a problem, namely that the settings written by my settings class for a parser are not persistent when the program gets restarted. This problem occurs only on Windows, but in both Python x86 and x64 environments and when compiled using PyInstaller. It also does not matter whether the program is run as Administrator or not.
When the program first runs, write_def(self) is called by the constructor. This function writes the defaults correctly to the file specified. After this, read_set(self) is called so the class variables are set. These class variables then do match the default values.
In another file, namely frames.py, write_set(self) is called, and all settings are passed as arguments. Using print statements I have asserted that the write_set(self) function receives the correct values. No errors occur when writing the settings to the file, and when running read_set(self) again, the new settings are read correctly and this is also shown in the GUI.
However, when closing the program and running it again, the default settings are again shown. This is not behaviour I expected.
Below I have added the settings class, implemented with cPickle. When using pickle instead, the behaviour is the same. When using shelve as in this file, the behaviour is the same. When using dill, the behaviour is the same. When implementing a ConfigParser.RawConfigParser (in the configparser branch of the GitHub repository linked to earlier), the behaviour is the same, and additionally, when viewing the settings file in a text editor, it is visible that the settings in the file are not updated.
When running the same code on Linux (Ubuntu 16.04.1 LTS with Python 2.7), everything works as expected with pickle and shelve versions. The settings are correctly saved and loaded from the file. Am I doing something wrong? Is it a Windows-specific issue with Python?
Thank you in advance for any help!
RedFantom.
# Written by RedFantom, Wing Commander of Thranta Squadron and Daethyra, Squadron Leader of Thranta Squadron
# Thranta Squadron GSF CombatLog Parser, Copyright (C) 2016 by RedFantom and Daethyra
# For license see LICENSE

# UI imports
import tkMessageBox
# General imports
import getpass
import os
import cPickle
# Own modules
import vars


# Class with default settings for in the settings file
class defaults:
    # Version to display in settings tab
    version = "2.0.0_alpha"
    # Path to get the CombatLogs from
    cl_path = 'C:/Users/' + getpass.getuser() + "/Documents/Star Wars - The Old Republic/CombatLogs"
    # Automatically send and retrieve names and hashes of ID numbers from the remote server
    auto_ident = str(False)
    # Address and port of the remote server
    server = ("thrantasquadron.tk", 83)
    # Automatically upload CombatLogs as they are parsed to the remote server
    auto_upl = str(False)
    # Enable the overlay
    overlay = str(True)
    # Set the overlay opacity, or transparency
    opacity = str(1.0)
    # Set the overlay size
    size = "big"
    # Set the corner the overlay will be displayed in
    pos = "TL"
    # Set the defaults style
    style = "plastik"


# Class that loads, stores and saves settings
class settings:
    # Set the file_name for use by other functions
    def __init__(self, file_name="settings.ini"):
        self.file_name = file_name
        # Set the install path in the vars module
        vars.install_path = os.getcwd()
        # Check for the existence of the specified settings_file
        if self.file_name not in os.listdir(vars.install_path):
            print "[DEBUG] Settings file could not be found. Creating a new file with default settings"
            self.write_def()
            self.read_set()
        else:
            try:
                self.read_set()
            except:
                tkMessageBox.showerror("Error", "Settings file available, but it could not be read. Writing defaults.")
                self.write_def()
        vars.path = self.cl_path

    # Read the settings from a file containing a pickle and store them as class variables
    def read_set(self):
        with open(self.file_name, "r") as settings_file_object:
            settings_dict = cPickle.load(settings_file_object)
        self.version = settings_dict["version"]
        self.cl_path = settings_dict["cl_path"]
        self.auto_ident = settings_dict["auto_ident"]
        self.server = settings_dict["server"]
        self.auto_upl = settings_dict["auto_upl"]
        self.overlay = settings_dict["overlay"]
        self.opacity = settings_dict["opacity"]
        self.size = settings_dict["size"]
        self.pos = settings_dict["pos"]
        self.style = settings_dict["style"]

    # Write the default settings found in the class defaults to a pickle in a file
    def write_def(self):
        settings_dict = {"version": defaults.version,
                         "cl_path": defaults.cl_path,
                         "auto_ident": bool(defaults.auto_ident),
                         "server": defaults.server,
                         "auto_upl": bool(defaults.auto_upl),
                         "overlay": bool(defaults.overlay),
                         "opacity": float(defaults.opacity),
                         "size": defaults.size,
                         "pos": defaults.pos,
                         "style": defaults.style
                         }
        with open(self.file_name, "w") as settings_file:
            cPickle.dump(settings_dict, settings_file)

    # Write the settings passed as arguments to a pickle in a file
    # Setting defaults to default if not specified, so all settings are always written
    def write_set(self, version=defaults.version, cl_path=defaults.cl_path,
                  auto_ident=defaults.auto_ident, server=defaults.server,
                  auto_upl=defaults.auto_upl, overlay=defaults.overlay,
                  opacity=defaults.opacity, size=defaults.size, pos=defaults.pos,
                  style=defaults.style):
        settings_dict = {"version": version,
                         "cl_path": cl_path,
                         "auto_ident": bool(auto_ident),
                         "server": server,
                         "auto_upl": bool(auto_upl),
                         "overlay": bool(overlay),
                         "opacity": float(opacity),
                         "size": str(size),
                         "pos": pos,
                         "style": style
                         }
        with open(self.file_name, "w") as settings_file_object:
            cPickle.dump(settings_dict, settings_file_object)
        self.read_set()
Sometimes it takes a while to get to an answer, and I just thought of this: what doesn't happen on Linux that does happen on Windows? The answer to that question is: changing the working directory to the directory of the files being parsed. And then it becomes obvious: the settings are stored correctly, but the folder where the settings file is created changes during the program's run, so the settings don't get written to the original settings file; instead, a new settings file is created in another location.
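In code terms, a brief sketch of the implied fix (illustrative, not the project's actual patch): resolve the settings file to an absolute path once, before any os.chdir() call, so later directory changes no longer affect where it is read from or written to.
import os

class settings:
    def __init__(self, file_name="settings.ini"):
        # Anchor the settings file to the install directory instead of
        # whatever the current working directory happens to be later on
        self.file_name = os.path.join(os.path.abspath(os.getcwd()), file_name)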

Rename log file in Python while file keeps writing any other logs

I am using the Python logging mechanism for keeping a record of my logs. I have two types of logs:
one is a rotating log (log1, log2, log3...) and the other a non-rotating log called json.log (which, as the name suggests, holds JSON logs).
The log files are created when the server is started and closed when the app is shut down.
What I am trying to do, in general, is: when I press the import button on my page, have all JSON logs saved to the SQLite DB.
The problem I am facing is:
When I try to rename the json.log file like this:
source_file = "./logs/json.log"
snapshot_file = "./logs/json.snapshot.log"
try:
    os.rename(source_file, snapshot_file)
I get WindowsError: [Error 32] The process cannot access the file because it is being used by another process,
and this is because the file is continuously held open by the logger. Therefore, I need to "close" the file somehow so I can perform my I/O operation successfully.
The thing is that this is not desirable, because logs might be lost in the window between the file being closed, renamed and then "re-created".
I was wondering if anyone has come across such a scenario before and whether any practical solution was found.
I have tried something which works, but it does not seem convenient, and I am not sure whether it is safe enough that no logs are lost.
My code is this:
source_file = "./logs/json.log"
snapshot_file = "./logs/json.snapshot.log"
try:
    logger = get_logger()
    # some hackish way to remove the handler for json.log
    if len(logger.handlers) > 2:
        logger.removeHandler(logger.handlers[2])
    if not os.path.exists(snapshot_file):
        os.rename(source_file, snapshot_file)
    try:
        if type(logger.handlers[2]) == RequestLoggerHandler:
            del logger.handlers[2]
    except IndexError:
        pass
    # re-adding the logs file handler so it continues writing the logs
    json_file_name = configuration["brew.log_file_dir"] + os.sep + "json.log"
    json_log_level = logging.DEBUG
    json_file_handler = logging.FileHandler(json_file_name)
    json_file_handler.setLevel(json_log_level)
    json_file_handler.addFilter(JSONLoggerFiltering())
    json_file_handler.setFormatter(JSONFormatter())
    logger.addHandler(json_file_handler)
... the code then continues by writing the logs to the DB and deleting json.snapshot.log, until the next time the import button is pressed; then the snapshot is created again, only for writing the logs to the DB.
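For comparison, a more compact sketch of the same remove/rename/re-add sequence (names are illustrative, and it assumes you hold a reference to the json.log FileHandler): closing the handler makes Windows release the file so the rename can succeed, and a fresh handler is attached so logging resumes on a new json.log. Records emitted during the swap can still be missed, which is exactly the trade-off described above.
def snapshot_json_log(logger, handler, source_file, snapshot_file):
    logger.removeHandler(handler)
    handler.close()                      # Windows releases the file only after close()
    os.rename(source_file, snapshot_file)
    new_handler = logging.FileHandler(source_file)   # recreate json.log
    new_handler.setLevel(handler.level)
    new_handler.setFormatter(handler.formatter)
    logger.addHandler(new_handler)
    return new_handler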
Also for reference my log file has this format:
{'status': 200, 'actual_user': 1, 'resource_name': '/core/logs/process', 'log_level': 'INFO', 'request_body': None, ... }
Thanks in advance :)

Python Cherrypy Access Log Rotation

If I want the access log for CherryPy to only grow to a fixed size, how would I go about using rotating log files?
I've already tried http://www.cherrypy.org/wiki/Logging, which seems out of date or is missing information.
Look at http://docs.python.org/library/logging.html.
You probably want to configure a RotatingFileHandler
http://docs.python.org/library/logging.html#rotatingfilehandler
I've already tried http://www.cherrypy.org/wiki/Logging, which seems out of date, or has information missing.
Try adding:
import logging
import logging.handlers
import cherrypy # you might have imported this already
and instead of
log = app.log
maybe try
log = cherrypy.log
The CherryPy documentation of the custom log handlers shows this very example.
Here is the slightly modified version that I use on my app:
import logging
from logging import handlers
import cherrypy  # needed for cherrypy.log and cherrypy._cplogging below

def setup_logging():
    log = cherrypy.log
    # Remove the default FileHandlers if present.
    log.error_file = ""
    log.access_file = ""
    maxBytes = getattr(log, "rot_maxBytes", 10000000)
    backupCount = getattr(log, "rot_backupCount", 1000)
    # Make a new RotatingFileHandler for the error log.
    fname = getattr(log, "rot_error_file", "log\\error.log")
    h = handlers.RotatingFileHandler(fname, 'a', maxBytes, backupCount)
    h.setLevel(logging.DEBUG)
    h.setFormatter(cherrypy._cplogging.logfmt)
    log.error_log.addHandler(h)
    # Make a new RotatingFileHandler for the access log.
    fname = getattr(log, "rot_access_file", "log\\access.log")
    h = handlers.RotatingFileHandler(fname, 'a', maxBytes, backupCount)
    h.setLevel(logging.DEBUG)
    h.setFormatter(cherrypy._cplogging.logfmt)
    log.access_log.addHandler(h)

setup_logging()
CherryPy does its logging using the standard Python logging module. You will need to change it to use a RotatingFileHandler. This handler will take care of everything for you, including rotating the log when it reaches the configured maximum size.
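As a minimal sketch of what that looks like (the size and backup count are example values, not CherryPy defaults):
import cherrypy
import logging.handlers

# Example values: cap the access log at roughly 5 MB and keep the 3 most recent rotations
handler = logging.handlers.RotatingFileHandler('access.log', maxBytes=5 * 1024 * 1024, backupCount=3)
cherrypy.log.access_log.addHandler(handler)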
