I'm using Watchdog to watch a directory for new .xml files that are downloaded via ftplib on a time interval. When Watchdog sees a file, on_created() triggers a function to process/parse the XML, but it seems the download hasn't completed yet, causing a missing-data error in the subsequent function.
I've added a time.sleep(1) before the function is called, which has alleviated the error, but adding a delay seems like an unreliable method in the real world. I'm wondering if there's a method similar to a promise I can use instead of a delay. Or maybe I've completely misdiagnosed the issue and there's a simple answer? Open to any suggestion.
FYI, the file sizes can vary from roughly 100 KB to 4-5 MB.
FTP Function
def download(f):
    ftpt = ftplib.FTP(server)
    ftpt.login(username, password)
    ftpt.cwd(ftp_dir)
    print 'Connected to FTP directory'
    if f.startswith('TLC-EMAILUPDATE'):
        if os.path.exists(dl_dir + f) == 0:
            fhandle = open(os.path.join(dl_dir, f), 'wb')
            print 'Getting ' + f
            ftpt.retrbinary('RETR ' + f, fhandle.write)
            fhandle.close()
        elif os.path.exists(dl_dir + f) == 1:
            print 'File', f, 'Already Exists, Skipping Download'

ftp = ftplib.FTP(server)
ftp.login(username, password)
ftp.cwd(ftp_dir)
infiles = ftp.nlst()

pool = Pool(4)
pool.map(download, infiles)
Watchdog
def on_created(self, event):
    self.processfile(event)
    base = os.path.basename(event.src_path)
    if base.startswith('TLC-EMAILUPDATE'):
        print 'File for load report has been flagged'
        xmldoc = event.src_path
        if os.path.isfile(xmldoc) == 1:
            print 'File download complete'
            send_email(xmldoc)
Send Mail (with sleep)
The exception is thrown at the content variable where the parsing fails to read any data from the downloaded file.
def send_email(xmldoc):
    time.sleep(2)
    content = str(parse_xml.create_template(xmldoc))
    msg = MIMEText(content, TEXT_SUBTYPE)
    msg['Subject'] = EMAIL_SUBJECT
    msg['From'] = EMAIL_SENDER
    msg['To'] = listToStr(EMAIL_RECEIVERS)
    try:
        smtpObj = SMTP(GMAIL_SMTP, GMAIL_SMTP_PORT)
        smtpObj.ehlo()
        smtpObj.starttls()
        smtpObj.ehlo()
        smtpObj.login(user=EMAIL_SENDER, password=EMAIL_PASS)
        smtpObj.sendmail(EMAIL_SENDER, EMAIL_RECEIVERS, msg.as_string())
        smtpObj.quit()
        print 'Email has been sent to %s' % EMAIL_RECEIVERS
    except SMTPException as error:
        print 'Error: unable to send email : {err}'.format(err=error)
Simple answer: switch to monitoring the CLOSE_WRITE event, which only fires after the writer has closed the file. Alas, Watchdog doesn't support it directly. Either:
1) switch to pyinotify and use the following code (Linux only, not OS X), or
2) use Watchdog with on_any_event() and only act once the file has stopped growing (see the sketch after the pyinotify example below).
pyinotify example source
import os, sys
import pyinotify

class VideoComplete(pyinotify.ProcessEvent):
    def process_IN_CLOSE_WRITE(self, event):
        sys.stdout.write(
            'video complete: {}\n'.format(event.pathname)
        )
        sys.stdout.flush()

def main():
    wm = pyinotify.WatchManager()
    notifier = pyinotify.Notifier(
        wm, default_proc_fun=VideoComplete(),
    )
    mask = pyinotify.ALL_EVENTS
    path = os.path.expanduser('~/Downloads/incoming')
    wm.add_watch(path, mask, rec=True, auto_add=True)
    notifier.loop()

if __name__ == '__main__':
    main()
download a file
echo beer > ~/Downloads/incoming/beer.txt
output
video complete: /home/johnm/Downloads/incoming/beer.txt
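For option 2, here is a minimal Watchdog sketch (my code, not tested against the asker's setup): rather than trusting the first event, it polls the file size until it stops changing and only then hands the path to the send_email() from the question. The poll interval, retry count, and dl_dir value are placeholders.
import os
import time
from watchdog.observers import Observer
from watchdog.events import PatternMatchingEventHandler

def wait_until_stable(path, interval=1.0, retries=60):
    """Return True once the file exists and its size stops changing between polls."""
    last_size = -1
    for _ in range(retries):
        try:
            size = os.path.getsize(path)
        except OSError:
            size = -1  # file may not be visible yet
        if size > 0 and size == last_size:
            return True
        last_size = size
        time.sleep(interval)
    return False

class XmlHandler(PatternMatchingEventHandler):
    def on_created(self, event):
        if wait_until_stable(event.src_path):
            send_email(event.src_path)  # send_email() as defined in the question

dl_dir = '/path/to/download/dir'  # placeholder for the question's dl_dir
observer = Observer()
observer.schedule(XmlHandler(patterns=['TLC-EMAILUPDATE*.xml']), dl_dir)
observer.start()
try:
    while True:
        time.sleep(5)
except KeyboardInterrupt:
    observer.stop()
observer.join()
This still relies on polling, but it polls for a finished download rather than a fixed delay, so a slow 5 MB transfer doesn't break it the way a hard-coded sleep can.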
Related
I'm running a PowerShell script through Python that sends an email if a condition is met.
However, I'm running into a logical error when scheduling this program to run with the Windows Task Scheduler. The program runs with no errors (it seems), but I never receive an email, and the scheduler reports that the task ran successfully.
If I run it manually through the Python IDE, however, it runs fine and I get an email. The Python script itself doesn't seem to be the problem; it seems to be something going on with the PowerShell portion.
Is there a way, through Python, to log the PowerShell error code to a text file (or anywhere else) so I can see what is happening in the background?
Below is my full code:
import glob, os, time, subprocess, sys
from datetime import datetime
import datetime
from plyer import notification

print("Program has started.")
print("Analyzing contents of folder. This will take some time")
os.system('mode con: cols=40 lines=10')

path_to_watch = '//...Alert/Testing Folder/'
print("Right before we make the list of files")
list_of_files = glob.glob('//...Alert/Testing Folder/*')
textfile = '...Alert/Logs/NoNewFiles_Log.txt'
textNewFiles = '//...Alert/Logs/NewFilesAdded_Log.txt'
latest_file = max(list_of_files, key=os.path.getctime)
m_time = os.path.getmtime(latest_file)
dt_m = datetime.datetime.fromtimestamp(m_time).strftime('%m-%d-%Y')
dt_t = datetime.datetime.fromtimestamp(m_time).strftime('%m-%d-%Y %H:%M:%S')
print(latest_file, dt_m)

today = datetime.date.today().strftime('%m-%d-%Y')

# today's date and time
now = datetime.datetime.now()
dt_string = now.strftime("%m/%d/%Y %H:%M:%S")

# this is where you send the emails and create the alert
if dt_m != today:
    print("No new files were added today")
    notification.notify(
        title = 'Alert! No New CDR Files',
        message = 'No new files were added today. Check text file log \\...Alert\Logs',
        app_icon = None,
        timeout = 20,
    )
    # writes to a log file
    f = open(textfile, "a")
    f.write("Program ran on "+dt_string+" and it was found that no new files were added. Last file update was on "+dt_t+"\n--------\n")
    f.close()
    # powershell path
    # This grabs a powershell script I made and sends an email out.
    ps = '//...Alert/Emailing_Users_Scripts/No_New_File_Email_Alert.ps1'
    p = subprocess.Popen(["powershell", ps], stdout=subprocess.PIPE)
    p_out, p_err = p.communicate()
    print(p_out)
elif dt_m == today:
    print("New files added today. Ending with: ", latest_file)
    # writes to a log file
    f = open(textNewFiles, "a")
    f.write("Program ran on "+dt_string+" and new files were added today. Ending with: "+latest_file+". Last file update was on "+dt_t+"\n--------\n")
    f.close()
    # powershell path
    # This grabs a powershell script I made and sends an email out.
    ps = '//...Alert/Emailing_Users_Scripts/Email_Alert.ps1'
    #p = subprocess.Popen(["powershell","-ExecutionPolicy","Unrestricted",ps], stdout=subprocess.PIPE)
    p = subprocess.Popen(["powershell", ps], stdout=subprocess.PIPE)
    p_out, p_err = p.communicate()
    print(p_out)
    print(p_err)
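Not an answer from the thread, but one way to see what the PowerShell side is doing when it runs under Task Scheduler is to capture its stderr and exit code and append them to a log file. Note that in the code above p_err is always None, because only stdout was piped. A sketch, with placeholder script and log paths:
import subprocess

ps = r'\\server\share\Alert\Emailing_Users_Scripts\Email_Alert.ps1'   # placeholder path
log = r'\\server\share\Alert\Logs\PowerShell_Log.txt'                 # placeholder path

# -ExecutionPolicy Bypass avoids policy differences between your interactive
# session and the account the scheduled task runs under.
result = subprocess.run(
    ["powershell", "-NoProfile", "-ExecutionPolicy", "Bypass", "-File", ps],
    capture_output=True, text=True,
)

with open(log, "a") as f:
    f.write("exit code: %d\n" % result.returncode)
    f.write("stdout:\n" + result.stdout + "\n")
    f.write("stderr:\n" + result.stderr + "\n")
Checking the logged exit code and stderr should show whether PowerShell is failing outright or simply running under a different working directory or account than it does in your IDE.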
I'm trying to make a script that sends an e-mail once it detects that I have copied a file into a specified folder, using some pieces of code I found researching on the internet, and now I'm stuck with this error.
import time
from watchdog.observers import Observer
from watchdog.events import FileSystemEventHandler
import smtplib
from email.mime.text import MIMEText
from email.mime.multipart import MIMEMultipart
from email.mime.base import MIMEBase
from email import encoders
import os.path
class Watcher:
    DIRECTORY_TO_WATCH = "/Documents/ReportesSemanales"

    def __init__(self):
        self.observer = Observer()

    def run(self):
        event_handler = Handler()
        self.observer.schedule(event_handler, self.DIRECTORY_TO_WATCH, recursive=True)
        self.observer.start()
        try:
            while True:
                time.sleep(5)
        except:
            self.observer.stop()
            print("Error")
        self.observer.join()


class Handler(FileSystemEventHandler):

    @staticmethod
    def on_any_event(event):
        if event.is_directory:
            return None
        elif event.event_type == 'created':
            # Take any action here when a file is first created.
            def send_email(email_recipient,
                           email_subject,
                           email_message,
                           attachment_location=''):
                email_sender = 'MyUser@domain.com'

                msg = MIMEMultipart()
                msg['From'] = email_sender
                msg['To'] = email_recipient
                msg['Subject'] = email_subject
                msg.attach(MIMEText(email_message, 'plain'))

                if attachment_location != '':
                    filename = os.path.basename(attachment_location)
                    attachment = open(attachment_location, "rb")
                    part = MIMEBase('application', 'octet-stream')
                    part.set_payload(attachment.read())
                    encoders.encode_base64(part)
                    part.add_header('Content-Disposition',
                                    "attachment; filename= %s" % filename)
                    msg.attach(part)

                try:
                    server = smtplib.SMTP('smtp.office365.com', 587)
                    server.ehlo()
                    server.starttls()
                    server.login('MyUser@domain.com', 'MyPassword')
                    text = msg.as_string()
                    server.sendmail(email_sender, email_recipient, text)
                    print('email sent')
                    server.quit()
                except:
                    print("SMTP server connection error")
                return True

            send_email('MyUser@hotmail.com',
                       'Happy New Year',
                       'We love Outlook',
                       '/ReportesSemanales/Bitacora-Diaria.xlsx')
            print("Received created event - %s." % event.src_path)

        elif event.event_type == 'modified':
            # Take any action here when a file is modified.
            print("Received modified event - %s." % event.src_path)


if __name__ == '__main__':
    w = Watcher()
    w.run()
I already installed the watchdog API, and when I run the script I get this error in the terminal:
Traceback (most recent call last):
  File "SendEmail.py", line 90, in <module>
    w.run()
  File "SendEmail.py", line 22, in run
    self.observer.start()
  File "C:\Users\MyUser\AppData\Local\Programs\Python\Python38-32\lib\site-packages\watchdog\observers\api.py", line 260, in start
    emitter.start()
  File "C:\Users\MyUser\AppData\Local\Programs\Python\Python38-32\lib\site-packages\watchdog\utils\__init__.py", line 110, in start
    self.on_thread_start()
  File "C:\Users\MyUser\AppData\Local\Programs\Python\Python38-32\lib\site-packages\watchdog\observers\read_directory_changes.py", line 66, in on_thread_start
    self._handle = get_directory_handle(self.watch.path)
  File "C:\Users\MyUser\AppData\Local\Programs\Python\Python38-32\lib\site-packages\watchdog\observers\winapi.py", line 307, in get_directory_handle
    return CreateFileW(path, FILE_LIST_DIRECTORY, WATCHDOG_FILE_SHARE_FLAGS,
  File "C:\Users\MyUser\AppData\Local\Programs\Python\Python38-32\lib\site-packages\watchdog\observers\winapi.py", line 113, in _errcheck_handle
    raise ctypes.WinError()
FileNotFoundError: [WinError 3] The system cannot find the path specified.
How can I solve this?
Specifications:
OS: Windows 10
Python Version: Python 3.8.3
Editing: Visual Studio
This line:
FileNotFoundError: [WinError 3] The system cannot find the path specified.
means just what it says: somewhere during the execution of the Python script, one of the paths you specified in your code was not found.
This is commonly caused by typos in path definitions. In your case, it might be caused by mistakenly using forward slashes (/) in your paths instead of backslashes (\). While forward slashes are the convention on Linux/UNIX systems, Windows paths use backslashes.
Try changing this line:
DIRECTORY_TO_WATCH = "/Documents/ReportesSemanales"
to this (the r prefix keeps the backslashes from being treated as escape sequences):
DIRECTORY_TO_WATCH = r"\Documents\ReportesSemanales"
And do the same for the send_email() call, where you specify the path to the .xlsx file. If you still get errors, check for typos in the path and make sure the folders and files you specified actually exist: a path like \Documents\ReportesSemanales resolves relative to the root of the current drive, so you may need the full path to the folder (for example, under your user profile).
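As a quick sanity check (my addition, not part of the answer above), you can verify that the directory exists before handing it to the observer; the path below is a placeholder for wherever the folder actually lives on your machine:
from pathlib import Path

# Placeholder: adjust to the real location of the folder on your machine.
watch_dir = Path(r"C:\Users\MyUser\Documents\ReportesSemanales")

if not watch_dir.is_dir():
    raise SystemExit("Watch directory does not exist: %s" % watch_dir)

DIRECTORY_TO_WATCH = str(watch_dir)
Failing fast with a clear message is easier to debug than the WinError 3 raised from deep inside watchdog.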
I have a script that runs main(), and at the end I want to send the output it produced by e-mail. I don't want to write new files or anything; I just want to keep the original script unmodified and, at the end, send the contents of what it printed. Ideal code:
main()
send_mail()
I tried this:
def main():
    print('HELLOWORLD')

def send_email(subject='subject', message='', destination='me@gmail.com', password_path=None):
    from socket import gethostname
    from email.message import EmailMessage
    import smtplib
    import json
    import sys

    server = smtplib.SMTP('smtp.gmail.com', 587)
    smtplib.stdout = sys.stdout  # <<<<<-------- why doesn't it work?
    server.starttls()
    with open(password_path) as f:
        config = json.load(f)
        server.login('me@gmail.com', config['password'])

    # craft message
    msg = EmailMessage()
    #msg.set_content(message)
    msg['Subject'] = subject
    msg['From'] = 'me@gmail.com'
    msg['To'] = destination

    # send msg
    server.send_message(msg)

if __name__ == '__main__':
    main()
    send_email()
but it doesn't work.
I don't want to write other files or change the original python print statements. How to do this?
I tried this:
def get_stdout():
    import sys

    print('a')
    print('b')
    print('c')

    repr(sys.stdout)
    contents = ""
    #with open('some_file.txt') as f:
    #with open(sys.stdout) as f:
    for line in sys.stdout.readlines():
        contents += line
    print(contents)
but it does not let me read sys.stdout because it says it's not readable. How can I open it as readable, or change it to readable, in the first place?
I checked all of the following links but none helped:
How to send output from a python script to an email address
https://www.quora.com/How-can-I-send-an-output-from-a-Python-script-to-an-email-address
https://bytes.com/topic/python/answers/165835-email-module-redirecting-stdout
Redirect stdout to a file in Python?
How to handle both `with open(...)` and `sys.stdout` nicely?
Capture stdout from a script?
To send e-mails I am using:
def send_email(subject, message, destination, password_path=None):
    from socket import gethostname
    from email.message import EmailMessage
    import smtplib
    import json

    server = smtplib.SMTP('smtp.gmail.com', 587)
    server.starttls()
    with open(password_path) as f:
        config = json.load(f)
        server.login('me123@gmail.com', config['password'])

    # craft message
    msg = EmailMessage()

    message = f'{message}\nSend from Hostname: {gethostname()}'
    msg.set_content(message)
    msg['Subject'] = subject
    msg['From'] = 'me123@gmail.com'
    msg['To'] = destination

    # send msg
    server.send_message(msg)
Note that I keep my password in a JSON file and use an app password, as suggested by this answer: https://stackoverflow.com/a/60996409/3167448.
I am using the following to collect the contents of stdout, by also writing everything that is printed to a custom "stdout" file with the built-in print function:
import sys
from pathlib import Path

def my_print(*args, filepath='~/my_stdout.txt'):
    filepath = Path(filepath).expanduser()
    # do normal print
    __builtins__['print'](*args, file=sys.__stdout__)  # prints to terminal
    # open my stdout file in update mode
    with open(filepath, "a+") as f:
        # save the content we are trying to print
        __builtins__['print'](*args, file=f)  # saves in a file

def collect_content_from_file(filepath):
    filepath = Path(filepath).expanduser()
    contents = ''
    with open(filepath, 'r') as f:
        for line in f.readlines():
            contents = contents + line
    return contents
Note the a+ mode, which creates the file if it does not already exist.
Note that if you want to discard the old contents of your custom my_stdout.txt, you need to delete the file first (after checking that it exists):
# remove my stdout if it exists
os.remove(Path('~/my_stdout.txt').expanduser()) if os.path.isfile(Path('~/my_stdout.txt').expanduser()) else None
Credit for the print code goes to the answer here: How does one make an already opened file readable (e.g. sys.stdout)?
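For completeness, here is an alternative sketch (my addition, not from the original post) that skips the intermediate file entirely: it captures everything main() prints in memory with contextlib.redirect_stdout and passes the result to the send_email() shown above. The subject, destination address, and password path are placeholders.
import io
from contextlib import redirect_stdout

def run_and_email():
    buf = io.StringIO()
    # Everything main() prints goes into buf instead of the terminal.
    with redirect_stdout(buf):
        main()
    output = buf.getvalue()
    print(output)  # optionally echo it to the real stdout afterwards
    send_email(subject='script output',
               message=output,
               destination='me123@gmail.com',   # placeholder
               password_path='password.json')   # placeholder
This keeps the original main() and its print statements untouched, which is the constraint stated in the question.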
I have a Python script that uses Paramiko to get some files. The script reads a YAML config file to obtain the credentials for each client. I now need to use a private key for a new client (I can make that work by itself), but I'm unable to make the script work with both a key and a username/password supplied. How can I get the script to use the private key if one is supplied, and otherwise fall back to username/password to log in to SFTP? Right now, when the private key entry is empty, the script fails.
import paramiko, StringIO
import logging
import os
import yaml

try:
    cfg = yaml.load(open("config.yml"))["local"]
    paramiko.util.log_to_file('sftpin.log')
    sftp = paramiko.Transport(cfg["host"], cfg["port"])
    key_string = cfg["key"]
    not_really_a_file = StringIO.StringIO(key_string)
    private_key = paramiko.RSAKey.from_private_key(not_really_a_file,
                                                   password=cfg["key_passwd"])
    sftp.connect(username=cfg["user"], password=cfg["pass"],
                 pkey=private_key)
    sftp = paramiko.SFTPClient.from_transport(sftp)
    print 'Connecting'
    files = sftp.listdir(cfg["remote_outbox"])
    print 'Listing Contents'
    print files
    print 'Getting Files'
    for f in files:
        sftp.get(cfg["remote_outbox"]+f, cfg["local_inbox"]+f)
    print 'Job Completed'
finally:
    try:
        sftp.close()
    except NameError:
        print 'SFTP is not defined'
Any help is greatly appreciated!
Hey Danny, try the following snippet. Note: the key value from config.yml is the full path to an RSA private key file (like /home/user/.ssh/id_rsa).
import paramiko
import StringIO
import yaml

try:
    cfg = yaml.load(open("config.yml"))["local"]
    paramiko.util.log_to_file('sftpin.log')
    sftp = paramiko.Transport(cfg["host"], cfg["port"])
    key_string = cfg["key"]
    private_key = None
    if key_string is not None:
        f = open(key_string, 'r')
        s = f.read()
        not_really_a_file = StringIO.StringIO(s)
        private_key = paramiko.RSAKey.from_private_key(not_really_a_file,
                                                       password=cfg["key_passwd"])
    sftp.connect(username=cfg["user"],
                 password=cfg["pass"],
                 pkey=private_key)
    sftp = paramiko.SFTPClient.from_transport(sftp)
    print 'Connecting'
    files = sftp.listdir(cfg["remote_outbox"])
    print 'Listing Contents'
    print files
    print 'Getting Files'
    for f in files:
        remote = "%s%s" % (cfg["remote_outbox"], f)
        local = "%s%s" % (cfg["local_inbox"], f)
        try:
            sftp.get(remote, local)
        except IOError:
            print "Failed to copy %s" % remote
    print 'Job Completed'
finally:
    sftp.close()
Regards KG.
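As an aside (my sketch, not part of KG's answer), the same either/or logic is a bit shorter with the higher-level paramiko.SSHClient API, which accepts a key file path or a password directly; the cfg keys are the same ones the snippet above reads. A key protected by a passphrase would still need the RSAKey.from_private_key(..., password=...) route from the answer, passed via pkey=.
import paramiko

def open_sftp(cfg):
    """Use the private key when one is configured, otherwise fall back to password auth."""
    client = paramiko.SSHClient()
    client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    client.connect(
        hostname=cfg["host"],
        port=cfg["port"],
        username=cfg["user"],
        password=cfg.get("pass"),              # tried only after key-based attempts
        key_filename=cfg.get("key") or None,   # path to the RSA key file, or None
    )
    return client.open_sftp()
Usage would be sftp = open_sftp(cfg), followed by the same listdir()/get() calls as above.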
Problem Statement:
I have multiple (1000+) *.gz files on a remote server. I have to read these files and check for certain strings; if a string matches, I have to return the file name. I have tried the following code. It works, but it does not seem efficient, as there is a huge amount of I/O involved. Can you please suggest a more efficient way to do this?
My Code:
import gzip
import os
import time
import paramiko
import multiprocessing
from bisect import insort

synchObj = multiprocessing.Manager()

hostname = '192.168.1.2'
port = 22
username = 'may'
password = 'Apa$sW0rd'

def miniAnalyze():
    ifile_list = synchObj.list([])  # A synchronized list to store the file names containing the matched string.

    def analyze_the_file(filename):
        strings = ("error 72", "error 81",)  # Hard-coded strings that need to be searched.
        try:
            ssh = paramiko.SSHClient()
            # Code to FTP the file to the local system from the remote machine.
            .....
            ........
            path_f = '/home/user/may/' + filename
            # Read the gzip file on the local system after the FTP is done.
            with gzip.open(path_f, 'rb') as f:
                contents = f.read()
                if any(s in contents for s in strings):
                    print "File " + str(path_f) + " is a hit."
                    insort(ifile_list, filename)  # Push the file into the list if there is a match.
                    os.remove(path_f)
                else:
                    os.remove(path_f)
        except Exception, ae:
            print "Error while analyzing file " + str(ae)
        finally:
            if ifile_list:
                print "The error is at " + str(ifile_list)
            ftp.close()
            ssh.close()

    def assign_to_proc():
        # Glob files matching a pattern on the remote host and pass each one to analyze_the_file via multiprocessing.
        apath = '/home/remotemachine/log/'
        apattern = '"*.gz"'
        first_command = 'find {path} -name {pattern}'
        command = first_command.format(path=apath, pattern=apattern)
        try:
            ssh = paramiko.SSHClient()
            ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
            ssh.connect(hostname, username=username, password=password)
            stdin, stdout, stderr = ssh.exec_command(command)
            while not stdout.channel.exit_status_ready():
                time.sleep(2)
            filelist = stdout.read().splitlines()
            jobs = []
            for ifle in filelist:
                p = multiprocessing.Process(target=analyze_the_file, args=(ifle,))
                jobs.append(p)
                p.start()
            for job in jobs:
                job.join()
        except Exception, fe:
            print "Error while getting file names " + str(fe)
        finally:
            ssh.close()

    assign_to_proc()  # kick off the remote listing and the per-file workers

if __name__ == '__main__':
    miniAnalyze()
The above code is slow: there is a lot of I/O involved in getting each .gz file onto the local system. Kindly help me find a better way to do this.
Execute a remote OS command such as zgrep and process the command's results locally. That way, you don't have to transfer the whole file contents to your local machine.
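For example, a rough sketch of that idea (my addition, not the answerer's code), reusing the connection details (hostname, username, password) and log path from the question; zgrep -l prints only the names of the files that contain a match, so only file names travel over the wire:
import paramiko

strings = ("error 72", "error 81")
remote_glob = '/home/remotemachine/log/*.gz'   # same path as in the question

ssh = paramiko.SSHClient()
ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
ssh.connect(hostname, username=username, password=password)

# Build one zgrep call: -l lists matching file names, -F treats patterns as
# literal strings, and each -e adds another pattern to search for.
patterns = ' '.join('-e "%s"' % s for s in strings)
command = 'zgrep -l -F %s %s' % (patterns, remote_glob)

stdin, stdout, stderr = ssh.exec_command(command)
matching_files = stdout.read().splitlines()   # decided entirely on the remote host
print(matching_files)

ssh.close()
The decompression and string search happen on the remote machine, which removes both the file transfer and the local gzip.open() pass per file.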