Python watch for file changes and identify user - python

I need to produce a small script that will watch for accidental changes made by users to a large shared file structure.
I have found I can get the change events using the ReadDirectoryChanges API as per
http://timgolden.me.uk/python/win32_how_do_i/watch_directory_for_changes.html
However I can't see how I can identify the user account that has made the changes, so that I can send out a notification.
Is it possible to get the name of the user account that moved the file/directory?

Tricky question; I'll give you an answer in two parts.
First, as an optional part, you can watch the file modifications themselves and add custom actions.
Example of file modification tracking, working on Windows / Linux / Mac / BSD
import time

import watchdog.events
import watchdog.observers


class StateHandler(watchdog.events.PatternMatchingEventHandler):
    def on_modified(self, event):
        print(event.event_type)
        print(event.key)
        print(event.src_path)
        # Add your code here to do whatever you want on file modification

    def on_created(self, event):
        pass

    def on_moved(self, event):
        pass

    def on_deleted(self, event):
        pass


fs_event_handler = StateHandler()
fs_observer = watchdog.observers.Observer()
fs_observer.schedule(fs_event_handler, r'C:\Users\SomeUser\SomeFolder', recursive=True)
fs_observer.start()

try:
    while True:
        time.sleep(2)
except KeyboardInterrupt:
    fs_observer.stop()
fs_observer.join()
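If you only care about certain files, PatternMatchingEventHandler also accepts glob patterns; a small, assumed variation on the handler above (the patterns themselves are placeholders):

# Only react to .log and .docx files, and skip directory events
fs_event_handler = StateHandler(patterns=['*.log', '*.docx'],
                                ignore_directories=True)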
Using the above filesystem observer, you can trigger security event log reviews.
You might also trigger them as a scheduled task, but it's more fun to trigger them on filesystem modifications.
In order for the security event logs to contain file modification information, you need to enable file auditing for the required directories using SACLs (right-click your folder, then Security, then Auditing).
Then you can go through the security logs on file events.
Going through security logs can be done with windows_tools.
Get it installed with python -m pip install windows_tools.wmi_queries (obviously only works under Windows)
Then do the following:
from windows_tools.wmi_queries import *

result = query_wmi('SELECT * FROM Win32_NTLogEvent WHERE Logfile="Security" AND TimeGenerated > "{}"'.format(create_current_cim_timestamp(hour_offset=1)))
for r in result:
    print(r)
You can add WHERE clauses like EventCode={integer} in order to filter only the events (file modifications or otherwise) you need.
Usually the event codes you're looking for are 4656, 4660, 4663 and 4670 (open, delete, edit, create).
See this Microsoft article to find out which WHERE clauses the event log class accepts.
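For instance, a narrowed-down variation of the query above, filtering to file access events (4663) from the last hour and reusing the same query_wmi and create_current_cim_timestamp helpers:

from windows_tools.wmi_queries import *

# Only file access events (4663) from the last hour
result = query_wmi(
    'SELECT * FROM Win32_NTLogEvent WHERE Logfile="Security" '
    'AND EventCode=4663 AND TimeGenerated > "{}"'.format(
        create_current_cim_timestamp(hour_offset=1)))
for r in result:
    print(r)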
DISCLAIMER: I'm the author of windows_tools package.

In a UNIX environment you can use os and pwd to retrieve the user account that owns the changed file.
import os
import pwd
file_stat = os.stat("<changed_file>")
user_name = pwd.getpwuid(file_stat.st_uid).pw_name
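A minimal sketch combining this with the watchdog handler from the first part (the class name is my own invention; note that st_uid gives the file's owner, not necessarily the account that made the last change):

import os
import pwd

import watchdog.events


class OwnerReportingHandler(watchdog.events.PatternMatchingEventHandler):
    def on_modified(self, event):
        if not event.is_directory:
            # Resolve the uid of the file's owner to an account name
            owner = pwd.getpwuid(os.stat(event.src_path).st_uid).pw_name
            print('{} modified, owner: {}'.format(event.src_path, owner))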

Dynamically switching log storage folder

I am using Python's logging to log the execution of functions and other actions within an application. The log files are stored in a remote folder which is automatically accessible when I connect to the VPN (let's say \\remote\directory). That is the normal situation: 99% of the time there is a connection and the log is stored without errors.
I need a solution for the situation when either the VPN connection or the Internet connection is lost and the logs have to be temporarily stored locally. I think that each time something is about to be logged, I need to check whether the remote folder is accessible. I couldn't really find a solution, but I guess I need to modify the FileHandler somehow.
TLDR: You can already scroll down to blues' answer and my UPDATE section - there is my latest attempt to solve the issue.
Currently my handler is set like this:
log = logging.getLogger('general')
handler_error = logging.handlers.RotatingFileHandler(log_path+"\\error.log", 'a', encoding="utf-8")
log.addHandler(handler_error)
Here is a condition that sets the log path, but only once - when logging is initialized. If I think correctly, I would like to run this condition each time something is logged:
if os.path.isdir(f"\\\\remote\\folder\\"):  # if remote is accessible
    log_path = f"\\\\remote\\folder\\dev\\{d.year}\\{month}\\"
    # create this month's dir if it does not exist; logging does not handle that
    os.makedirs(os.path.dirname(log_path), exist_ok=True)
else:  # if remote is not accessible
    log_path = f"localFiles\\logs\\dev\\{d.year}\\{month}\\"
    log.debug("Cannot access the remote directory. Are you connected to the internet and the VPN?")
I have found a related thread, but was not able to adjust it to my own needs: Dynamic filepath & filename for FileHandler in logger config file in python
Should I dig deeper into a custom handler, or is there some other way? It would be enough if I could call my own function that changes the logging path when needed (or switches to a logger with the proper path) whenever logging is executed.
UPDATE:
Per blues's answer, I have tried modifying a handler to suit my needs. Unfortunately, the code below, in which I try to switch baseFilename between the local and remote paths, does not work. The logger always saves the log to the local log file (the one set while initializing the logger). Thus, I think my attempts to modify the baseFilename do not take effect?
class HandlerCheckBefore(RotatingFileHandler):
    print("handler starts")

    def emit(self, record):
        calltime = date.today()
        if os.path.isdir(f"\\\\remote\\Path\\"):  # if remote is accessible
            print("handler remote")
            # create remote folders if not yet existent
            os.makedirs(os.path.dirname(f"\\\\remote\\Path\\{calltime.year}\\{calltime.strftime('%m')}\\"), exist_ok=True)
            if self.level >= 20:  # if error or above
                self.baseFilename = f"\\\\remote\\Path\\{calltime.year}\\{calltime.strftime('%m')}\\error.log"
            else:
                self.baseFilename = f"\\\\remote\\Path\\{calltime.year}\\{calltime.strftime('%m')}\\{calltime.strftime('%d')}-{calltime.strftime('%m')}.log"
            super().emit(record)
        else:  # save to local
            print("handler local")
            if self.level >= 20:  # error or above
                self.baseFilename = f"localFiles\\logs\\{calltime.year}\\{calltime.strftime('%m')}\\error.log"
            else:
                self.baseFilename = f"localFiles\\logs\\{calltime.year}\\{calltime.strftime('%m')}\\{calltime.strftime('%d')}-{calltime.strftime('%m')}.log"
            super().emit(record)


# init the logger
handler_error = HandlerCheckBefore(f"\\\\remote\\Path\\{calltime.year}\\{calltime.strftime('%m')}\\error.log", 'a', encoding="utf-8")
handler_error.setLevel(logging.ERROR)
handler_error.setFormatter(fmt)
log.addHandler(handler_error)
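A side note on why the baseFilename switch may appear to do nothing: FileHandler opens its stream once and keeps writing to that open stream, so reassigning baseFilename alone does not redirect output. A minimal, untested sketch of the idea: close the stream before switching, so the next emit() lazily reopens the new path.

def switch_base_filename(handler, new_path):
    # FileHandler caches the open stream; close it first so that
    # the next emit() calls _open() on the new baseFilename
    if handler.baseFilename != new_path:
        handler.close()
        handler.baseFilename = new_path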
The best way to solve this is indeed to create a custom handler. You can either check before each write that the directory is still there, or you can attempt the write and handle the resulting error in handleError, which handlers call when an exception occurs during emit(). I recommend the former. The code below shows how both could be implemented:
import os
import logging
from logging.handlers import RotatingFileHandler


class GrzegorzRotatingFileHandlerCheckBefore(RotatingFileHandler):
    def emit(self, record):
        if os.path.isdir(os.path.dirname(self.baseFilename)):  # put appropriate check here
            super().emit(record)
        else:
            logging.getLogger('offline').error('Cannot access the remote directory. Are you connected to the internet and the VPN?')


class GrzegorzRotatingFileHandlerHandleError(RotatingFileHandler):
    def handleError(self, record):
        logging.getLogger('offline').error('Something went wrong when writing log. Probably remote dir is not accessible')
        super().handleError(record)


log = logging.getLogger('general')
log.addHandler(GrzegorzRotatingFileHandlerCheckBefore('check.log'))
log.addHandler(GrzegorzRotatingFileHandlerHandleError('handle.log'))
offline_logger = logging.getLogger('offline')
offline_logger.addHandler(logging.FileHandler('offline.log'))

log.error('test logging')

TensorFlow Serving: Update model_config (add additional models) at runtime

I'm busy configuring a TensorFlow Serving client that asks a TensorFlow Serving server to produce predictions on a given input image, for a given model.
If the model being requested has not yet been served, it is downloaded from a remote URL to a folder where the server's models are located. (The client does this). At this point I need to update the model_config and trigger the server to reload it.
This functionality appears to exist (based on https://github.com/tensorflow/serving/pull/885 and https://github.com/tensorflow/serving/blob/master/tensorflow_serving/apis/model_service.proto#L22), but I can't find any documentation on how to actually use it.
I am essentially looking for a python script with which I can trigger the reload from client side (or otherwise to configure the server to listen for changes and trigger the reload itself).
So it took me ages of trawling through pull requests to finally find a code example for this. For the next person who has the same question as me, here is an example of how to do this. (You'll need the tensorflow_serving package for this; pip install tensorflow-serving-api).
Based on this pull request (which at the time of writing hadn't been accepted and was closed since it needed review): https://github.com/tensorflow/serving/pull/1065
from tensorflow_serving.apis import model_service_pb2_grpc
from tensorflow_serving.apis import model_management_pb2
from tensorflow_serving.config import model_server_config_pb2
import grpc


def add_model_config(host, name, base_path, model_platform):
    channel = grpc.insecure_channel(host)
    stub = model_service_pb2_grpc.ModelServiceStub(channel)
    request = model_management_pb2.ReloadConfigRequest()
    model_server_config = model_server_config_pb2.ModelServerConfig()

    # Create a config to add to the list of served models
    config_list = model_server_config_pb2.ModelConfigList()
    one_config = config_list.config.add()
    one_config.name = name
    one_config.base_path = base_path
    one_config.model_platform = model_platform

    model_server_config.model_config_list.CopyFrom(config_list)
    request.config.CopyFrom(model_server_config)

    print(request.IsInitialized())
    print(request.ListFields())

    response = stub.HandleReloadConfigRequest(request, 10)
    if response.status.error_code == 0:
        print("Reload successful")
    else:
        print("Reload failed!")
        print(response.status.error_code)
        print(response.status.error_message)


add_model_config(host="localhost:8500",
                 name="my_model",
                 base_path="/models/my_model",
                 model_platform="tensorflow")
Add a model to the TF Serving server and to the existing config file conf_filepath: use the arguments name, base_path and model_platform for the new model. This keeps the original models intact.
Note a small difference from Karl's answer above: using MergeFrom instead of CopyFrom.
pip install tensorflow-serving-api
import grpc
from google.protobuf import text_format
from tensorflow_serving.apis import model_service_pb2_grpc, model_management_pb2
from tensorflow_serving.config import model_server_config_pb2


def add_model_config(conf_filepath, host, name, base_path, model_platform):
    with open(conf_filepath, 'r+') as f:
        config_ini = f.read()
    channel = grpc.insecure_channel(host)
    stub = model_service_pb2_grpc.ModelServiceStub(channel)
    request = model_management_pb2.ReloadConfigRequest()
    model_server_config = model_server_config_pb2.ModelServerConfig()
    config_list = model_server_config_pb2.ModelConfigList()
    model_server_config = text_format.Parse(text=config_ini, message=model_server_config)

    # Create a config to add to the list of served models
    one_config = config_list.config.add()
    one_config.name = name
    one_config.base_path = base_path
    one_config.model_platform = model_platform

    model_server_config.model_config_list.MergeFrom(config_list)
    request.config.CopyFrom(model_server_config)

    response = stub.HandleReloadConfigRequest(request, 10)
    if response.status.error_code == 0:
        with open(conf_filepath, 'w+') as f:
            f.write(request.config.__str__())
        print("Updated TF Serving conf file")
    else:
        print("Failed to update model_config_list!")
        print(response.status.error_code)
        print(response.status.error_message)
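Hypothetical usage, mirroring the call from the answer above (the paths and model name are placeholders):

add_model_config(conf_filepath="/models/models.config",
                 host="localhost:8500",
                 name="my_model",
                 base_path="/models/my_model",
                 model_platform="tensorflow")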
While the solutions mentioned here work fine, there is one more method that you can use to hot-reload your models: the --model_config_file_poll_wait_seconds flag.
As mentioned in the documentation:
Setting the --model_config_file_poll_wait_seconds flag instructs the server to periodically check for a new config file at the --model_config_file filepath.
So you just have to update the config file at that path, and tf-serving will load any new models and unload any models removed from the config file.
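For example, launching the server with the poll flag might look like this (the paths and the 60-second interval are placeholders):

tensorflow_model_server --port=8500 \
    --model_config_file=/models/models.config \
    --model_config_file_poll_wait_seconds=60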
Edit 1: I looked at the source code and it seems that the flag is present from the very early version of tf-serving but there have been instances where some users were not able to use this flag (see this). So, try to use the latest version if possible.
If you're using the method described in this answer, please note that you're actually launching multiple tensorflow model server instances instead of a single model server, effectively making the servers compete for resources instead of working together to optimize tail latency.

Changes not written to file correctly in Python 2.7

For a few days now, I have been struggling with a problem, namely that the settings written by my settings class for a parser are not persistent when the program gets restarted. This problem occurs only on Windows, but in both Python x86 and x64 environments and when compiled using PyInstaller. It also does not matter whether the program is run as Administrator or not.
When the program first runs, write_def(self) is called by the constructor. This function writes the defaults correctly to the specified file. After this, read_set(self) is called so the class variables are set. These class variables then do match the default values.
In another file, namely frames.py, write_set(self) is called, and all settings are passed as arguments. Using print statements I have asserted that the write_set(self) function receives the correct values. No errors occur when writing the settings to the file, and when running read_set(self) again, the new settings are read correctly and this is also shown in the GUI.
However, when closing the program and running it again, the default settings are again shown. This is not behaviour I expected.
Below I have added the settings class, implemented with cPickle. When using pickle the behaviour is the same. When using shelve as in this file, the behaviour is the same. When using dill, the behaviour is the same. When implementing a ConfigParser.RawConfigParser (in the configparser branch of the GitHub repository linked to earlier), the behaviour is the same, and additionally, when viewing the settings file in a text editor, it is visible that the settings in the file are not updated.
When running the same code on Linux (Ubuntu 16.04.1 LTS with Python 2.7), everything works as expected with pickle and shelve versions. The settings are correctly saved and loaded from the file. Am I doing something wrong? Is it a Windows-specific issue with Python?
Thank you in advance for any help!
RedFantom.
# Written by RedFantom, Wing Commander of Thranta Squadron and Daethyra, Squadron Leader of Thranta Squadron
# Thranta Squadron GSF CombatLog Parser, Copyright (C) 2016 by RedFantom and Daethyra
# For license see LICENSE

# UI imports
import tkMessageBox
# General imports
import getpass
import os
import cPickle
# Own modules
import vars


# Class with default settings for in the settings file
class defaults:
    # Version to display in settings tab
    version = "2.0.0_alpha"
    # Path to get the CombatLogs from
    cl_path = 'C:/Users/' + getpass.getuser() + "/Documents/Star Wars - The Old Republic/CombatLogs"
    # Automatically send and retrieve names and hashes of ID numbers from the remote server
    auto_ident = str(False)
    # Address and port of the remote server
    server = ("thrantasquadron.tk", 83)
    # Automatically upload CombatLogs as they are parsed to the remote server
    auto_upl = str(False)
    # Enable the overlay
    overlay = str(True)
    # Set the overlay opacity, or transparency
    opacity = str(1.0)
    # Set the overlay size
    size = "big"
    # Set the corner the overlay will be displayed in
    pos = "TL"
    # Set the defaults style
    style = "plastik"


# Class that loads, stores and saves settings
class settings:
    # Set the file_name for use by other functions
    def __init__(self, file_name="settings.ini"):
        self.file_name = file_name
        # Set the install path in the vars module
        vars.install_path = os.getcwd()
        # Check for the existence of the specified settings_file
        if self.file_name not in os.listdir(vars.install_path):
            print "[DEBUG] Settings file could not be found. Creating a new file with default settings"
            self.write_def()
            self.read_set()
        else:
            try:
                self.read_set()
            except:
                tkMessageBox.showerror("Error", "Settings file available, but it could not be read. Writing defaults.")
                self.write_def()
        vars.path = self.cl_path

    # Read the settings from a file containing a pickle and store them as class variables
    def read_set(self):
        with open(self.file_name, "r") as settings_file_object:
            settings_dict = cPickle.load(settings_file_object)
        self.version = settings_dict["version"]
        self.cl_path = settings_dict["cl_path"]
        self.auto_ident = settings_dict["auto_ident"]
        self.server = settings_dict["server"]
        self.auto_upl = settings_dict["auto_upl"]
        self.overlay = settings_dict["overlay"]
        self.opacity = settings_dict["opacity"]
        self.size = settings_dict["size"]
        self.pos = settings_dict["pos"]
        self.style = settings_dict["style"]

    # Write the default settings found in the class defaults to a pickle in a file
    def write_def(self):
        settings_dict = {"version": defaults.version,
                         "cl_path": defaults.cl_path,
                         "auto_ident": bool(defaults.auto_ident),
                         "server": defaults.server,
                         "auto_upl": bool(defaults.auto_upl),
                         "overlay": bool(defaults.overlay),
                         "opacity": float(defaults.opacity),
                         "size": defaults.size,
                         "pos": defaults.pos,
                         "style": defaults.style
                         }
        with open(self.file_name, "w") as settings_file:
            cPickle.dump(settings_dict, settings_file)

    # Write the settings passed as arguments to a pickle in a file
    # Setting defaults to default if not specified, so all settings are always written
    def write_set(self, version=defaults.version, cl_path=defaults.cl_path,
                  auto_ident=defaults.auto_ident, server=defaults.server,
                  auto_upl=defaults.auto_upl, overlay=defaults.overlay,
                  opacity=defaults.opacity, size=defaults.size, pos=defaults.pos,
                  style=defaults.style):
        settings_dict = {"version": version,
                         "cl_path": cl_path,
                         "auto_ident": bool(auto_ident),
                         "server": server,
                         "auto_upl": bool(auto_upl),
                         "overlay": bool(overlay),
                         "opacity": float(opacity),
                         "size": str(size),
                         "pos": pos,
                         "style": style
                         }
        with open(self.file_name, "w") as settings_file_object:
            cPickle.dump(settings_dict, settings_file_object)
        self.read_set()
Sometimes it takes a while to get to an answer, and I just thought of this: what doesn't happen on Linux that does happen on Windows? The answer to that question is: changing the directory to the directory of the files being parsed. And then it becomes obvious: the settings are stored correctly, but the folder where the settings file is created changes while the program runs, so the settings don't get written to the original settings file; instead, a new settings file is created in another location.
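A minimal sketch of a fix under that diagnosis (my own suggestion, not the asker's code): resolve the settings file relative to the script once, at import time, so later os.chdir() calls cannot redirect reads and writes.

import os

# Anchor the settings file next to this module instead of os.getcwd(),
# so changing the working directory later has no effect
INSTALL_PATH = os.path.dirname(os.path.abspath(__file__))
SETTINGS_FILE = os.path.join(INSTALL_PATH, "settings.ini")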

Link generator using django or any python module

I want to generate temporary download links for my users.
Is it OK if I use Django to generate the links using URL patterns?
Would that be the correct way to do it? It may be that I don't understand some of the processes involved, and that it would overflow my memory or something else. Some kind of example or tools would be appreciated - some nginx or Apache modules, probably?
So, what I want to achieve is to make a URL pattern which depends on user and time, decrypt it and return a file in the view.
A simple scheme might be to use a hash digest of username and timestamp:
from datetime import datetime
from hashlib import sha1

user = 'bob'
time = datetime.now().isoformat()
plain = user + '\0' + time
token = sha1(plain)
print token.hexdigest()
# "1e2c5078bd0de12a79d1a49255a9bff9737aa4a4"
Next you store that token in a memcache with an expiration time. This way any of your webservers can reach it and the token will auto-expire. Finally add a Django url handler for '^download/.+' where the controller just looks up that token in the memcache to determine if the token is valid. You can even store the filename to be downloaded as the token's value in memcache.
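A rough sketch of that storage step, using Django's cache framework in place of a raw memcache client (the filename and the one-hour timeout are arbitrary examples):

from django.core.cache import cache

# Map the token to the file it unlocks; the entry expires on its own
cache.set(token.hexdigest(), '/files/report.pdf', timeout=3600)

# Later, in the download view: None means expired or never issued
filename = cache.get(token_from_url)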
Yes, it would be OK to let Django generate the URLs, this being separate from handling the URLs with urls.py. Typically you don't want Django to handle the serving of files (see the static file docs[1] about this), so get the notion of using URL patterns out of your head.
What you might want to do is generate a random key using a hash like md5/sha1. Store the file, the key, and the datetime it was added in the database, and create the download directory in a root directory that's available from your webserver, like Apache or nginx (I suggest nginx). Since it's temporary, you'll want to add a cron job that checks whether the time since the URL was generated has expired, cleans up the file and removes the DB entry. This should be a Django command for manage.py.
Please note this is example code written just for this and not tested! It may not work the way you were planning on achieving this goal, but it works. If you want the download to be password-protected too, then look into HTTP basic auth. You can generate and remove entries on the fly in an httpd.auth file using htpasswd and the subprocess module when you create the link, or at registration time.
import hashlib, random, datetime, os, shutil

# model to hold link info. has these fields: key (charfield), filepath (filepathfield)
# datetime (datetimefield), url (charfield), orgpath (filepathfield of the original path
# or a foreignkey to the files model.
from models import MyDlLink
# settings.py for the app
from myapp import settings as myapp_settings


# full path and name of file to dl.
def genUrl(filepath):
    # create a one-time salt for randomness
    salt = ''.join(['{0}'.format(random.randrange(10)) for i in range(10)])
    key = hashlib.sha1('{0}{1}'.format(salt, filepath)).hexdigest()
    newpath = os.path.join(myapp_settings.DL_ROOT, key)
    shutil.copy2(filepath, newpath)
    newlink = MyDlLink()
    newlink.key = key
    newlink.date = datetime.datetime.now()
    newlink.orgpath = filepath
    newlink.newpath = newpath
    newlink.url = "{0}/{1}/{2}".format(myapp_settings.DL_URL, key, os.path.basename(filepath))
    newlink.save()
    return newlink


# in commands
def check_url_expired():
    maxage = datetime.timedelta(days=7)
    now = datetime.datetime.now()
    for link in MyDlLink.objects.all():
        if (now - link.date) > maxage:
            os.remove(link.newpath)
            link.delete()
[1] http://docs.djangoproject.com/en/1.2/howto/static-files/
It sounds like you are suggesting using some kind of dynamic url conf.
Why not forget your concerns by simplifying and setting up a single url that captures a large encoded string that depends on user/time?
(r'^download/(?P<encrypted_id>.*)/$', 'download_file'),  # use your own regexp

def download_file(request, encrypted_id):
    decrypted = decrypt(encrypted_id)  # decrypt() and get_file() are your own helpers
    _file = get_file(decrypted)
    return _file
A lot of sites just use a get param too.
www.example.com/download_file/?09248903483o8a908423028a0df8032
If you are concerned about performance, look at the answers in this post: Having Django serve downloadable files
Where the use of the apache x-sendfile module is highlighted.
Another alternative is to simply redirect to the static file served by whatever means from django.
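A hedged sketch of that redirect option (lookup_path_for_token is a hypothetical helper for whatever token store you picked above):

from django.shortcuts import redirect

def download_file(request, encrypted_id):
    # Look up the real file for this token, however you stored it
    path = lookup_path_for_token(encrypted_id)  # hypothetical helper
    # Let nginx/Apache serve the bytes; Django only issues the redirect
    return redirect('/static/downloads/' + path)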

Determine if current user is in Administrators group (Windows/Python)

I want certain functions in my application to only be accessible if the current user is an administrator.
How can I determine if the current user is in the local Administrators group using Python on Windows?
You could try this:
import ctypes
print ctypes.windll.shell32.IsUserAnAdmin()
import win32net

def if_user_in_group(group, member):
    members = win32net.NetLocalGroupGetMembers(None, group, 1)
    return member.lower() in list(map(lambda d: d['name'].lower(), members[0]))

# Function usage
print(if_user_in_group('SOME_GROUP', 'SOME_USER'))
Of course in your case 'SOME_GROUP' will be 'administrators'
I'd like to give some credit to Vlad Bezden, because without his use of the win32net module this answer would not exist.
If you really want to know whether the user has the ability to act as an admin past UAC, you can do the following. It also lists the groups the current user is in, if needed. It will work on most (all?) language set-ups. The local group just has to start with "Admin", which it usually does... (Does anyone know if some set-ups differ?)
To use this code snippet you'll need to have pywin32 module installed,
if you don't have it yet you can get it from PyPI: pip install pywin32
IMPORTANT TO KNOW:
It may be important to some users / coders that the function os.getlogin() is only available since python3.1 on Windows operating systems...
python3.1 Documentation
win32net Reference
from time import sleep
import os
import win32net

if 'logonserver' in os.environ:
    server = os.environ['logonserver'][2:]
else:
    server = None


def if_user_is_admin(Server):
    groups = win32net.NetUserGetLocalGroups(Server, os.getlogin())
    isadmin = False
    for group in groups:
        if group.lower().startswith('admin'):
            isadmin = True
    return isadmin, groups


# Function usage
is_admin, groups = if_user_is_admin(server)

# Result handling
if is_admin == True:
    print('You are an admin user!')
else:
    print('You are not an admin user.')

print('You are in the following groups:')
for group in groups:
    print(group)

sleep(10)
# (C) 2018 DelphiGeekGuy#Stack Overflow
# Don't hesitate to credit the author if you plan to use this snippet for production.
Oh, and where from time import sleep and sleep(10) appear: insert your own imports/code...
