Detect file changes in Python - python

I want to detect changes in certain directories. If the files change, for example if many changes happen at the same time, the script should print a message and perform certain actions. (I'm programming for Linux.)
import sys
import time
import logging
from watchdog.observers import Observer
from watchdog.events import LoggingEventHandler

if __name__ == '__main__':
    # Set the format for logging info
    logging.basicConfig(level=logging.INFO,
                        format='%(asctime)s - %(message)s',
                        datefmt='%Y-%m-%d %H:%M:%S')
    # Set format for displaying path
    #path = sys.argv[1] if len(sys.argv) > 1 else '.'
    path = '/home/kali/Downloads'
    event_handler = LoggingEventHandler()
    observer = Observer()
    observer.schedule(event_handler, path, recursive=True)
    observer.start()
    try:
        while True:
            time.sleep(1)
    except KeyboardInterrupt:
        observer.stop()
I'm using this as my base code, but how can I add a condition that fires only when several changes happen at around the same time?
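One possible way to express "several changes at the same time" is a sliding time window: count the events seen in the last few seconds and act once the count reaches a threshold. The sketch below is only an illustration under that assumption. `BurstDetector`, `make_handler`, the threshold and the window length are all hypothetical names and values, not part of watchdog; only `FileSystemEventHandler` and its `on_any_event` hook come from the library.

```python
import time
from collections import deque


class BurstDetector:
    """Report a burst when `threshold` events arrive within `window` seconds."""

    def __init__(self, threshold=5, window=2.0, clock=time.monotonic):
        self.threshold = threshold
        self.window = window
        self.clock = clock          # injectable so the logic can be tested
        self._stamps = deque()      # timestamps of recent events

    def record(self):
        """Register one event; return True if it completes a burst."""
        now = self.clock()
        self._stamps.append(now)
        # Drop timestamps that have fallen out of the sliding window.
        while self._stamps and now - self._stamps[0] > self.window:
            self._stamps.popleft()
        if len(self._stamps) >= self.threshold:
            self._stamps.clear()    # start counting afresh after reporting
            return True
        return False


def make_handler(detector, on_burst):
    """Build a watchdog handler that feeds every event into the detector."""
    # Imported lazily so the detector can be used without watchdog installed.
    from watchdog.events import FileSystemEventHandler

    class BurstHandler(FileSystemEventHandler):
        def on_any_event(self, event):
            if detector.record():
                on_burst(event)

    return BurstHandler()
```

In the base code, the handler could then be created with something like `event_handler = make_handler(BurstDetector(threshold=5, window=2.0), lambda e: print("many changes at once"))` and scheduled on the observer as before.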

Related

Why can't Python watchdog monitor some folders?

I am trying to use the "watchdog" package in Python to monitor changes to a folder.
The following code is copied directly from the watchdog documentation, and it works fine.
import sys
import logging
from watchdog.observers import Observer
from watchdog.events import LoggingEventHandler

if __name__ == '__main__':
    logging.basicConfig(level=logging.INFO,
                        format='%(asctime)s - %(message)s',
                        datefmt='%Y-%m-%d %H:%M:%S')
    path = '.'
    event_handler = LoggingEventHandler()
    observer = Observer()
    observer.schedule(event_handler, path, recursive=False)
    observer.start()
    try:
        while observer.is_alive():
            observer.join(1)
    finally:
        observer.stop()
        observer.join()
However, when I change the "path" variable, the code does not always work.
For example, with path='./build/' I see the changes in the './build/' folder as expected, but with path='./build/result/' the terminal prints nothing, no matter what changes I make in that folder.
I don't understand why, so I'm asking for help.
I am using Ubuntu 20.04 LTS through WSL2.
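The post does not say which filesystem the folder lives on. Assuming the project sits on a Windows drive mounted into WSL2 (for example under /mnt/c), inotify notifications may not arrive for every change, and a polling observer is one thing that might be worth trying. The sketch below swaps in watchdog's `PollingObserver`; the function name `watch_with_polling` and the one-second interval are hypothetical choices.

```python
def watch_with_polling(path, handler, interval=1.0):
    """Watch `path` by periodic directory snapshots instead of inotify."""
    # Imported here so the module loads even where watchdog is not installed.
    from watchdog.observers.polling import PollingObserver

    observer = PollingObserver(timeout=interval)
    observer.schedule(handler, path, recursive=False)
    observer.start()
    try:
        # Same wait loop as the documentation example.
        while observer.is_alive():
            observer.join(1)
    finally:
        observer.stop()
        observer.join()
```

It would be called as `watch_with_polling('./build/result/', LoggingEventHandler())`. Polling costs more CPU than inotify, so it is a diagnostic step more than a default.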

Python Watchdog with Slurm Output

I'm trying to use python-watchdog to monitor the output of SLURM jobs on a supercomputer. For some reason, the watchdog program isn't detecting changes in the files, even though tail -f shows that the file is indeed being changed. Here's my watchdog program:
import logging
import socket
import sys
import time
from watchdog.observers import Observer
from watchdog.events import PatternMatchingEventHandler

logging.basicConfig(level=logging.INFO,
                    format='%(asctime)s - %(message)s',
                    datefmt='%Y-%m-%d %H:%M:%S',
                    filename="/work/ollie/pgierz/PISM/pindex_vostok_ds50/scripts/pindex_vostok_ds50.watchdog")

def on_created(event):
    logging.info(f"hey, {event.src_path} has been created!")

def on_deleted(event):
    logging.info(f"what the f**k! Someone deleted {event.src_path}!")

def on_modified(event):
    logging.info(f"hey buddy, {event.src_path} has been modified")

def on_moved(event):
    logging.info(f"ok ok ok, someone moved {event.src_path} to {event.dest_path}")

if __name__ == "__main__":
    if "ollie" in socket.gethostname():
        logging.info("Not watching on login node...")
        sys.exit()
    # Only do this on compute node:
    patterns = "*"
    ignore_patterns = "*.watchdog"
    ignore_directories = False
    case_sensitive = True
    my_event_handler = PatternMatchingEventHandler(
        patterns, ignore_patterns, ignore_directories, case_sensitive
    )
    my_event_handler.on_created = on_created
    my_event_handler.on_deleted = on_deleted
    my_event_handler.on_modified = on_modified
    my_event_handler.on_moved = on_moved
    path = "/work/ollie/pgierz/PISM/pindex_vostok_ds50/scripts"
    #path = "/work/ollie/pgierz/PISM/pindex_vostok_ds30/"
    go_recursively = True
    my_observer = Observer()
    my_observer.schedule(my_event_handler, path, recursive=go_recursively)
    my_observer.start()
    try:
        while True:
            time.sleep(1)
    except KeyboardInterrupt:
        my_observer.stop()
        my_observer.join()
This is just a suspicion, but could it be that the filesystem doesn't actually register the file as being "changed" since it is still open from the batch job? Doing an ls -l or stat on the output files shows it was "modified" when the job started. Do I need to tell slurm to "flush" the file?
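A different explanation may be worth ruling out first. On Linux, watchdog's default observer relies on inotify, which generally reports only changes made through the local kernel; on a shared cluster filesystem, writes coming from another node may never produce an event. Under that assumption, comparing periodic snapshots of file sizes and modification times is a possible fallback. The sketch below uses only the standard library; `snapshot` and `diff_snapshots` are hypothetical helper names, not watchdog functions.

```python
import os


def snapshot(root):
    """Map every file under `root` to its (size, mtime_ns) pair."""
    state = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            try:
                st = os.stat(full)
            except OSError:
                continue  # file vanished between listing and stat
            state[full] = (st.st_size, st.st_mtime_ns)
    return state


def diff_snapshots(old, new):
    """Return (created, deleted, modified) as sorted lists of paths."""
    created = sorted(set(new) - set(old))
    deleted = sorted(set(old) - set(new))
    modified = sorted(p for p in new if p in old and new[p] != old[p])
    return created, deleted, modified
```

A watcher would call `snapshot(path)` in a loop with `time.sleep(...)` between rounds and log whatever `diff_snapshots` returns. A file only shows up as modified once the writing job has actually flushed data to it, so output buffering in the batch job can still delay detection.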

Store logging info as variable to use in email message alert

I'm only a couple of weeks into learning Python with no previous programming background, so I apologize for my ignorance.
I'm trying to use a combination of modules to monitor a folder for new files (watchdog), log an alert on any event (logging), and then send the alert to my email (smtplib).
I've found a really good example here: How to run an function when anything changes in a dir with Python Watchdog?
However, I'm stuck trying to save the logging info in a variable to use in my email message. I'm wondering if I'll need to write the logging output to a file and then read the line back in to use as a variable.
Anyway, this is what I have. Any help is appreciated. In the meantime I'll continue to Google.
import sys
import time
import logging
from watchdog.observers import Observer
from watchdog.events import LoggingEventHandler
import smtplib

class Event(LoggingEventHandler):
    def on_any_event(self, event):
        logMsg = logging.basicConfig(level=logging.INFO,
                                     format='%(asctime)s - %(message)s',
                                     datefmt='%Y-%m-%d %H:%M:%S')
        sender = 'NoReply#myDomain.com'
        receiver = 'test.user#myDomain.com'
        message = """From: No Reply <NoReply#myDomain.com>
TO: Test User <test.user#myDomain.com>
Subject: Folder Modify Detected
The following change was detected: """ + str(logMsg)
        mail = smtplib.SMTP('mailServer.myDomain.com', 25)
        mail.ehlo()
        mail.starttls()
        mail.sendmail(sender, receiver, message)
        mail.close()

if __name__ == "__main__":
    path = sys.argv[1] if len(sys.argv) > 1 else '.'
    event_handler = Event()
    observer = Observer()
    observer.schedule(event_handler, path, recursive=False)
    observer.start()
    try:
        while True:
            time.sleep(1)
    except KeyboardInterrupt:
        observer.stop()
    observer.join()
What you need is an SMTPHandler, so that every time the folder changes (that is, every time a new log record is created), an email is sent.
import sys
import time
import logging
import logging.handlers  # SMTPHandler lives in this submodule
from watchdog.observers import Observer
from watchdog.events import LoggingEventHandler

class Event(LoggingEventHandler):
    def on_any_event(self, event):
        # do stuff
        pass

if __name__ == "__main__":
    root = logging.getLogger()
    root.setLevel(logging.INFO)
    # Formatters are attached to handlers, not to the logger itself
    formatter = logging.Formatter('%(asctime)s - %(message)s',
                                  '%Y-%m-%d %H:%M:%S')
    mail_handler = logging.handlers.SMTPHandler(mailhost='mailserver',
                                                fromaddr='noreply#example.com',
                                                toaddrs=['me#example.com'],
                                                subject='The log',
                                                credentials=('user','pwd'),
                                                secure=None)
    mail_handler.setLevel(logging.INFO)
    mail_handler.setFormatter(formatter)
    root.addHandler(mail_handler) # Add this handler to root logger
    path = sys.argv[1] if len(sys.argv) > 1 else '.'
    event_handler = Event()
    observer = Observer()
    observer.schedule(event_handler, path, recursive=False)
    observer.start()
    try:
        while True:
            time.sleep(1)
    except KeyboardInterrupt:
        observer.stop()
    observer.join()
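If the goal is literally to hold the formatted log line in a variable, for instance to build the email body by hand with smtplib, a small custom logging handler can collect it in memory. This is a minimal sketch using only the standard library; `ListHandler`, the logger name `folder_watch` and the sample message are hypothetical.

```python
import logging


class ListHandler(logging.Handler):
    """Collect formatted log lines in memory instead of writing them out."""

    def __init__(self):
        super().__init__()
        self.lines = []

    def emit(self, record):
        # self.format applies the formatter set on this handler.
        self.lines.append(self.format(record))


logger = logging.getLogger("folder_watch")  # hypothetical logger name
logger.setLevel(logging.INFO)
logger.propagate = False  # keep the demo output out of the root logger

collector = ListHandler()
collector.setFormatter(
    logging.Formatter("%(asctime)s - %(message)s", "%Y-%m-%d %H:%M:%S")
)
logger.addHandler(collector)

# Stand-in for what an event handler would log on a file event.
logger.info("Created file: %s", "./report.txt")

# The most recent log line is now an ordinary string.
body = "The following change was detected: " + collector.lines[-1]
```

Inside an `on_any_event` method, the same `collector.lines[-1]` string could be appended to the message before calling `sendmail`.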
I was able to find a working example here and tailor it to my needs:
https://www.michaelcho.me/article/using-pythons-watchdog-to-monitor-changes-to-a-directory

Combination of python-daemon and watchdog

I'm trying to get the following code snippet to run. As soon as I daemonize the app, it no longer watches for new incoming pcap files.
#!/usr/bin/env pypy
import daemon
import watchdog
import logging
import logging.handlers
from logging import info, debug, warn, error
from watchdog.observers import Observer
from watchdog.events import PatternMatchingEventHandler
from time import sleep

def main():
    def on_created_pcap(event):
        error(event.src_path)

    handler = logging.handlers.SysLogHandler("/var/run/syslog")
    formatter = "%(filename)s: [%(levelname)s] %(message)s"
    handler.setFormatter(logging.Formatter(formatter))
    logging.getLogger().addHandler(handler)
    pattern = ['*.pcap']
    event_handler = watchdog.events.PatternMatchingEventHandler(pattern)
    event_handler.on_created = on_created_pcap
    observer = Observer()
    observer.schedule(event_handler, "./")
    observer.start()
    while True:
        sleep(1)

with daemon.DaemonContext():
    main()
Without the daemon context it works. Any ideas?
NB: I run the snippet on OS X. If you try it on a Linux box, you need to replace /var/run/syslog in the SysLogHandler with ('localhost', 514).
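One likely cause, though not confirmed for this setup, is the working directory: `daemon.DaemonContext` changes it when the context is entered (its `working_directory` option defaults to "/"), so the relative path "./" no longer points at the folder the script was started from. A minimal sketch of a workaround, assuming that is the problem, resolves the path before daemonizing. `daemon_options` and `run` are hypothetical helper names.

```python
import os


def daemon_options(watch_dir="."):
    """Resolve paths while the working directory is still the caller's."""
    absolute = os.path.abspath(watch_dir)
    # Keep the original working directory explicitly and hand back an
    # absolute path, so nothing depends on where the daemon ends up.
    return {"working_directory": os.getcwd()}, absolute


def run(main):
    """Start `main(watch_dir)` as a daemon; `main` is the caller's function."""
    import daemon  # python-daemon, imported lazily

    options, watch_dir = daemon_options(".")
    with daemon.DaemonContext(**options):
        main(watch_dir)
```

With this, `main` would take the directory as a parameter and pass it to `observer.schedule(event_handler, watch_dir)` instead of "./".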

How to match only particular events with Python watchdog

I intend to use Python watchdog to handle a directory that files are written to,
and I'm only interested in image files. The trouble is that I don't quite grok the code on this page.
This is my attempt:
from watchdog.observers import Observer
from watchdog.events import PatternMatchingEventHandler

class Beat(PatternMatchingEventHandler):
    def on_create(self,event):
        print event.src_path

if __name__ == "__main__":
    patt = ['\w+[.]jpeg']
    event_handler = Beat(patterns=patt,ignore_directories=True,)
    observer = Observer()
    path = "./"
    observer.schedule(event_handler, path, recursive=True)
    observer.start()
I'm trying to use the pattern matching class, but I'm getting nothing. How is it supposed to be used?
Based on the source code, fnmatch is being used under the hood. fnmatch only does UNIX glob-style pattern matching, not regular expressions, which means you may have better luck with a glob such as *.jpeg than with \w+[.]jpeg.
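The difference is easy to see with the standard library alone. The short check below is only an illustration; the file name `photo.jpeg` is a made-up example, and `fnmatchcase` is used so the result does not depend on the platform's case rules.

```python
import re
from fnmatch import fnmatchcase

name = "photo.jpeg"  # hypothetical file name

# A glob pattern matches as intended.
glob_hit = fnmatchcase(name, "*.jpeg")

# The same regex text, treated as a glob, is mostly literal characters:
# it would only match a name that really starts with "\w+".
regex_as_glob_hit = fnmatchcase(name, r"\w+[.]jpeg")

# Interpreted by the re module, the pattern does match.
regex_hit = re.fullmatch(r"\w+[.]jpeg", name) is not None

print(glob_hit, regex_as_glob_hit, regex_hit)  # True False True
```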
You can actually use the RegexMatchingEventHandler instead of PatternMatchingEventHandler to accomplish exactly what you want to do:
import time
from watchdog.observers import Observer
from watchdog.events import RegexMatchingEventHandler

class ExampleHandler(RegexMatchingEventHandler):
    # The hook for new files is named on_created, not on_create
    def on_created(self, event):
        print(event.src_path)

if __name__ == "__main__":
    # Raw string avoids invalid-escape warnings; the leading .* lets the
    # pattern match full paths such as ./dir/photo.jpeg
    pattern = r'.*\w+\.jpeg$'
    event_handler = ExampleHandler(regexes=[pattern], ignore_directories=True)
    observer = Observer()
    path = "./"
    observer.schedule(event_handler, path, recursive=True)
    observer.start()
    try:
        while True:
            time.sleep(1)
    except KeyboardInterrupt:
        observer.stop()
    observer.join()
