I have written a script that watches a directory for newly created files. I split event.src_path on the target directory that I gave the observer, which successfully gave me the file name.
See the script below:
def on_created(event):
    source_path = event.src_path
    file_name = source_path.split(TargetDir, 1)[1]
    print(f"{file_name} was just Created")

if __name__ == "__main__":
    for dir in range(len(TargetDir)):
        event_handler = FileSystemEventHandler()
        event_handler.on_created = on_created
        observer = Observer()
        observer.schedule(event_handler, path=TargetDir[0], recursive=True)
        observer.start()
However, I am now trying to feed in a list of target directories, looping through each one and registering the on_created() callback. The target directory is therefore no longer a global variable, and I need to pass each directory into the function. I'm using watchdog, and I don't think it's possible to add extra arguments to the on_created() function. If I'm wrong, please let me know how to do this. Otherwise, is there a simpler way to get just the name of the file that was created, without passing in the target directory for that reason alone? I can get event.src_path, but this gives the full path, and I wouldn't know where to split it if I were watching multiple directories.
Well one simple way is to pass in a different function for each directory, for example :
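from watchdog.events import FileSystemEventHandler
from watchdog.observers import Observer

def create_callback(dir):
    # dir is captured by the closure, so each callback remembers its own directory
    def on_create(event):
        source_path = event.src_path
        file_name = source_path.split(dir, 1)[1]
        print(f"{file_name} was just Created")
    return on_create

if __name__ == "__main__":
    # iterate over the directory paths themselves, not over their indices
    for dir in TargetDir:
        event_handler = FileSystemEventHandler()
        event_handler.on_created = create_callback(dir)
        observer = Observer()
        observer.schedule(event_handler, path=dir, recursive=True)
        observer.start()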
The dir variable is captured in the closure of the on_create function, so each callback can use the directory it was created for.
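If you would rather not nest functions, functools.partial can do the same job. Here is a minimal sketch, assuming only that the event object has a src_path attribute. The SimpleNamespace is a stand-in for a real watchdog event, and the directory names are made up for illustration. It uses os.path.relpath instead of split so that the result has no leading separator:

```python
import os
from functools import partial
from types import SimpleNamespace


def on_created(watched_dir, event):
    """Report the created file relative to the directory being watched."""
    name = os.path.relpath(event.src_path, watched_dir)
    print(f"{name} was just created")
    return name


# Hypothetical watched directory and a stand-in event (not a real watchdog event)
watched = os.path.join("data", "incoming")
fake_event = SimpleNamespace(src_path=os.path.join(watched, "sub", "a.txt"))

# partial() fixes the first argument, leaving a callback that takes only the event
callback = partial(on_created, watched)
result = callback(fake_event)
```

With watchdog, you would assign event_handler.on_created = partial(on_created, target) inside the loop over your target directories.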
Related
I am working on a Python script to which I will pass a directory, and I need to get all the log files from it. Currently, I have a small script that watches for changes to a file and then processes that information.
It works well, but only for a single file with a hardcoded path. How can I pass a directory to it and watch all the files in it? My confusion is that I process the file in a while loop that should always stay running, so how can I do that for n files inside a directory?
Current code:
import time

f = open('/var/log/nginx/access.log', 'r')
while True:
    line = ''
    while len(line) == 0 or line[-1] != '\n':
        tail = f.readline()
        if tail == '':
            time.sleep(0.1)  # avoid busy waiting
            continue
        line += tail
    print(line)
    _process_line(line)
This question was already flagged as a duplicate, but the requirement here is to get changes line by line from all files inside a directory. The other questions cover a single file, which already works.
Try this library: watchdog.
Python API library and shell utilities to monitor file system events.
https://pythonhosted.org/watchdog/
Simple example:
import sys
import time
import logging
from watchdog.observers import Observer
from watchdog.events import LoggingEventHandler

if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO,
                        format='%(asctime)s - %(message)s',
                        datefmt='%Y-%m-%d %H:%M:%S')
    path = sys.argv[1] if len(sys.argv) > 1 else '.'
    event_handler = LoggingEventHandler()
    observer = Observer()
    observer.schedule(event_handler, path, recursive=True)
    observer.start()
    try:
        while True:
            time.sleep(1)
    except KeyboardInterrupt:
        observer.stop()
    observer.join()
I'm not completely sure I understand what you're trying to do, but maybe use:
import os

while True:
    files = os.listdir(directory)
    for file in files:
        pass  # your code for checking the contents of the file goes here
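To make the polling idea concrete for the "line by line from every file" requirement, here is a rough standard-library sketch. It is not the asker's code: read_new_lines and the offsets dictionary are names invented for this example. It remembers how far each file has been read and returns only complete lines that were appended since the previous call:

```python
import os
import tempfile


def read_new_lines(directory, offsets):
    """Return (path, line) pairs appended since the previous call.

    `offsets` maps each file path to the byte position already consumed.
    """
    new_lines = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if not os.path.isfile(path):
            continue
        with open(path, "rb") as f:
            f.seek(offsets.get(path, 0))
            while True:
                raw = f.readline()
                # Stop at EOF, or at a partial line that has no newline yet
                if not raw.endswith(b"\n"):
                    break
                offsets[path] = f.tell()
                text = raw.decode("utf-8", "replace").rstrip("\r\n")
                new_lines.append((path, text))
    return new_lines


# Small demonstration in a temporary directory
with tempfile.TemporaryDirectory() as log_dir:
    offsets = {}
    log_a = os.path.join(log_dir, "a.log")
    with open(log_a, "w") as f:
        f.write("first\n")
    first_pass = read_new_lines(log_dir, offsets)
    with open(log_a, "a") as f:
        f.write("second\npartial")
    second_pass = read_new_lines(log_dir, offsets)
```

In a long-running script you would call read_new_lines inside the while True loop, pass each line to your processing function, and sleep briefly between passes to avoid busy waiting. Note that this sketch does not handle log rotation or truncated files.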
I am making a Python script that renames a file in a folder to the name of the folder.
For example, if a folder is called TestFolder and the txt file in the folder is called test, the script will rename the file to TestFolder.txt.
But how can I make the script work outside of the directory it is located in?
Below is my code so far; I hope I explained it well enough.
import os

temp = os.path.dirname(os.path.realpath(__file__))
src = "{temp}\\".format(temp=temp)

def renamer():
    path = os.path.dirname(src)
    folder = os.path.basename(path)
    os.rename("{directory}\\{file}".format(directory=src, file=listDir()),
              "{directory}\\{file}.txt".format(directory=src, file=folder))

def listDir():
    for file in os.listdir(src):
        if file.endswith(".txt"):
            return file

def main():
    print("Hello World")
    print(listDir())
    renamer()
    print(listDir())

if __name__ == "__main__":
    main()
Your problem is that you went to some trouble to specify the script location as the renaming path:
temp = os.path.dirname(os.path.realpath(__file__))
src = "{temp}\\".format(temp=temp)

def renamer():
    path = os.path.dirname(src)
    folder = os.path.basename(path)
The solution is simple: if you don't want the script's location as the path/folder, then don't do that. Put what you want in its place. Use the cwd (current working directory) to rename in the execution location; otherwise, re-code your program to accept a folder name as input. Examples of both are readily available online.
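As a rough illustration of the "accept a folder name as input" option, here is a sketch in which the folder is a function parameter instead of being derived from __file__. The function name rename_txt_to_folder_name is invented for this example, and the demonstration runs in a temporary directory so it does not touch real files:

```python
import os
import tempfile


def rename_txt_to_folder_name(folder):
    """Rename the first .txt file in `folder` to '<folder name>.txt'."""
    folder = os.path.abspath(folder)
    target = os.path.join(folder, os.path.basename(folder) + ".txt")
    for name in sorted(os.listdir(folder)):
        if name.endswith(".txt"):
            os.rename(os.path.join(folder, name), target)
            return target
    return None  # no .txt file was found


# Demonstration: TestFolder/test.txt becomes TestFolder/TestFolder.txt
with tempfile.TemporaryDirectory() as tmp:
    test_folder = os.path.join(tmp, "TestFolder")
    os.mkdir(test_folder)
    with open(os.path.join(test_folder, "test.txt"), "w") as f:
        f.write("hello")
    new_path = rename_txt_to_folder_name(test_folder)
    new_name = os.path.basename(new_path)
    remaining = os.listdir(test_folder)
```

To drive this from the command line, you could pass sys.argv[1] when an argument is given and fall back to os.getcwd() otherwise.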
I am writing a piece of code to recursively process *.py files. The code block is as follows:
class FileProcessor(object):
    def convert(self, file_path):
        if os.path.isdir(file_path):
            """ If the path is a directory, then process it recursively
            until a file is met"""
            dir_list = os.listdir(file_path)
            print("Now Processing Directory:", file_path)
            i = 1
            for temp_dir in dir_list:
                print(i, ":", temp_dir)
                i = i + 1
                self.convert(temp_dir)
        else:
            """ if the path is not a directory"""
            """ TODO something meaningful """

if __name__ == '__main__':
    tempObj = FileProcessor()
    tempObj.convert(sys.argv[1])
When I run the script with a directory path as the argument, it only processes the first level of the directory. The line
self.convert(temp_dir)
never seems to get called. I'm using Python 3.5.
The recursion is happening fine, but temp_dir is not a directory so it passes control to your stub else block. You can see this if you put print(file_path) outside your if block.
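temp_dir is just the name of the next entry, not its full path: "C:/users/adsmith/tmp/folder" becomes just "folder". os.path.abspath would resolve that name against the current working directory rather than against file_path, so join it with the parent path instead:
self.convert(os.path.join(file_path, temp_dir))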
Although the canonical way to do this (as mentioned in my comment on the question) is to use os.walk.
import os

class FileProcessor(object):
    def convert(self, file_path):
        for root, dirs, files in os.walk(file_path):
            # if file_path is C:/users/adsmith, then on the first iteration:
            #   root == C:/users/adsmith
            #   dirs is a list of the directory names in C:/users/adsmith
            #   files is a list of the file names in C:/users/adsmith
            # this walks on its own, so later iterations will visit
            # the deeper directories listed in `dirs`
            for i, d in enumerate(dirs):
                # this is also preferred to setting a counter var and incrementing
                print(i, ":", d)
                # no need to recurse here since os.walk does that itself
            for fname in files:
                # do something with the files? I guess?
                pass
As temp_dir has the filename only without parent path, you should change
self.convert(temp_dir)
to
self.convert(os.path.join(file_path, temp_dir))
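Putting the two answers together, here is a short sketch of collecting every *.py file under a directory with os.walk. The helper name find_py_files and the sample tree are made up for this example. The key point is the same as above: each file name must be joined with root to get a usable path.

```python
import os
import tempfile


def find_py_files(top):
    """Return the full paths of all *.py files under `top`, searched recursively."""
    matches = []
    for root, dirs, files in os.walk(top):
        for fname in files:
            if fname.endswith(".py"):
                # fname is a bare name; join it with root to get a usable path
                matches.append(os.path.join(root, fname))
    return sorted(matches)


# Demonstration on a small temporary tree
with tempfile.TemporaryDirectory() as tmp:
    os.mkdir(os.path.join(tmp, "pkg"))
    for rel in ("a.py", os.path.join("pkg", "b.py"), os.path.join("pkg", "notes.txt")):
        with open(os.path.join(tmp, rel), "w") as f:
            f.write("")
    found = find_py_files(tmp)
    relative = [os.path.relpath(p, tmp) for p in found]
```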
I just started working with the Watchdog library in Python on Mac, and am doing some basic tests to make sure things are working like I would expect. Unfortunately, they're not -- I can only seem to obtain the path to the folder containing the file where an event was registered, not the path to the file itself.
Below is a simple test program (slightly modified from the example provided by Watchdog) to print out the event type, path, and time whenever an event is registered.
import time
from watchdog.observers import Observer
from watchdog.events import LoggingEventHandler
from watchdog.events import FileSystemEventHandler

class TestEventHandler(FileSystemEventHandler):
    def on_any_event(self, event):
        print("event noticed: " + event.event_type +
              " on file " + event.src_path + " at " + time.asctime())

if __name__ == "__main__":
    event_handler = TestEventHandler()
    observer = Observer()
    observer.schedule(event_handler, path='~/test', recursive=True)
    observer.start()
    try:
        while True:
            time.sleep(1)
    except KeyboardInterrupt:
        observer.stop()
    observer.join()
The src_path variable should contain the path of the file that had the event happen to it.
However, in my testing, when I modify a file, src_path only prints the path to the folder containing the file, not the path to the file itself. For example, when I modify the file moon.txt in the folder europa, the program prints the following output:
event noticed: modified on file ~/test/europa at Mon Jul 8 15:32:07 2013
What do I need to change in order to obtain the full path to the modified file?
Problem solved. As it turns out, FSEvents in OS X returns only the directory for file modified events, leaving you to scan the directory yourself to find out which file was modified. This is not mentioned in Watchdog documentation, though it's found easily in FSEvents documentation.
To get the full path to the file, I added the following snippet of code (inspired by this StackOverflow thread) to find the most recently modified file in a directory, to be used whenever event.src_path returns a directory.
if event.is_directory:
    files_in_dir = [event.src_path+"/"+f for f in os.listdir(event.src_path)]
    mod_file_path = max(files_in_dir, key=os.path.getmtime)
mod_file_path contains the full path to the modified file.
Thanks, ekl, for providing your solution. I just stumbled across the same problem. However, I had been using PatternMatchingEventHandler, which requires small changes to your solution:
Subclass FileSystemEventHandler instead.
Create an attribute pattern in which you store the pattern to match. This is not as flexible as the original PatternMatchingEventHandler, but it should suffice for most needs, and you will get the idea anyway if you want to extend it.
Here's the code you have to put in your FileSystemEventHandler subclass:
# needs "import os" and "import fnmatch" at module level
def __init__(self, pattern='*'):
    super(MidiEventHandler, self).__init__()
    self.pattern = pattern

def on_modified(self, event):
    super(MidiEventHandler, self).on_modified(event)
    if event.is_directory:
        files_in_dir = [event.src_path+"/"+f for f in os.listdir(event.src_path)]
        if len(files_in_dir) > 0:
            modifiedFilename = max(files_in_dir, key=os.path.getmtime)
        else:
            return
    else:
        modifiedFilename = event.src_path
    if fnmatch.fnmatch(os.path.basename(modifiedFilename), self.pattern):
        # print() with a single argument works on both Python 2 and 3
        print("Modified MIDI file: %s" % modifiedFilename)
One other thing I changed is that I check whether the directory is empty before running max() on the file list, because max() raises a ValueError when given an empty list.
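The "newest matching file" logic can also be pulled out into a small helper that is independent of watchdog. This is only a sketch for Python 3.4 or later, and the name newest_matching_file is made up for this example. It relies on the default argument of max() to handle the empty-directory case:

```python
import fnmatch
import os
import tempfile


def newest_matching_file(directory, pattern="*"):
    """Return the most recently modified file matching `pattern`, or None."""
    candidates = [
        os.path.join(directory, name)
        for name in os.listdir(directory)
        if fnmatch.fnmatch(name, pattern)
    ]
    candidates = [p for p in candidates if os.path.isfile(p)]
    # max() raises ValueError on an empty sequence unless a default is given
    return max(candidates, key=os.path.getmtime, default=None)


# Demonstration with explicit modification times
with tempfile.TemporaryDirectory() as tmp:
    empty_result = newest_matching_file(tmp, "*.mid")
    for name, mtime in (("old.mid", 1000), ("new.mid", 2000), ("newest.txt", 3000)):
        path = os.path.join(tmp, name)
        with open(path, "w") as f:
            f.write("")
        os.utime(path, (mtime, mtime))
    newest_mid = os.path.basename(newest_matching_file(tmp, "*.mid"))
```

Inside on_modified you could then call newest_matching_file(event.src_path, self.pattern) when event.is_directory is true and return early if the result is None.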
I'm currently putting together a script in Python which will do the following:
Create a directory in my Dropbox folder called 'Spartacus'.
Create a subdirectory in this location, named after the date and time of creation.
Within this directory, create a file called iprecord.txt, to which information will then be written.
Here is my code thus far, using Python v2.7 on Windows 7:
import os
import time
import platform
import urllib
current_dir = os.getcwd()
targetname = "Spartacus"
target_dir = os.path.join(current_dir, targetname)
timenow = time.strftime("\%d-%b-%Y %H-%M-%S")
def directoryVerification():
    os.chdir(current_dir)
    try:
        os.mkdir('Spartacus')
    except OSError:
        pass
    try:
        os.system('attrib +h Spartacus')
    except OSError:
        pass
def gatherEvidence():
    os.chdir(target_dir)
    try:
        evidential_dir = os.mkdir(target_dir + timenow)
        os.chdir(evidential_dir)
    except OSError:
        pass
    f = iprecord.txt
    with f as open:
        ip_addr = urllib.urlopen('http://www.biranchi.com/ip.php').read()
        f.write("IP Address:\t %s\t %s" % ip_addr, time.strftime("\%d-%b-%Y %H-%M-%S"))
x = directoryVerification()
y = gatherEvidence()
I keep getting an error on line 26, where it cannot resolve the full path to the dynamically named (date and time) directory. I've printed out the value of 'evidential_dir' and it shows as None.
Any pointers as to where I am going wrong? Thanks.
PS: Any other advice on improving my code would be appreciated.
PPS: Any advice on how to locate the default directory for 'Dropbox'? Is there a way of scanning a file system for a directory called 'Dropbox' and capturing the path?
os.mkdir() does not return a pathname, as you might be expecting. You also seem to build paths in inconsistent ways in different parts of your code.
Try this:
evidential_dir = os.path.join(target_dir, timenow)
os.mkdir(evidential_dir)
And fix your other line:
f = "iprecord.txt"
os.mkdir doesn't return anything.
evidential_dir = target_dir + timenow
try:
    os.mkdir(evidential_dir)
except OSError:
    pass
os.chdir(evidential_dir)
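For completeness, here is one way the whole flow could look without any os.chdir calls. This is a Python 3 sketch, not a drop-in fix for the 2.7 script above. The function name create_evidence_file is invented, the base directory is a temporary folder, and the IP address is a fixed placeholder instead of a network lookup. The path is built first with os.path.join and then passed to os.makedirs:

```python
import os
import tempfile
import time


def create_evidence_file(base_dir, text, folder_name="Spartacus"):
    """Create <base_dir>/<folder_name>/<timestamp>/iprecord.txt and return its path."""
    stamp = time.strftime("%d-%b-%Y %H-%M-%S")
    # Build the path first; os.makedirs (like os.mkdir) returns None
    evidential_dir = os.path.join(base_dir, folder_name, stamp)
    os.makedirs(evidential_dir, exist_ok=True)
    record_path = os.path.join(evidential_dir, "iprecord.txt")
    with open(record_path, "w") as f:
        f.write(text)
    return record_path


# Demonstration with a placeholder address from the documentation range
with tempfile.TemporaryDirectory() as tmp:
    record_path = create_evidence_file(tmp, "IP Address:\t203.0.113.7")
    with open(record_path) as f:
        contents = f.read()
    parent_names = os.path.relpath(record_path, tmp).split(os.sep)
```

Working with full paths throughout means the script never depends on the current working directory, which removes the need for the chdir calls in the original.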