I wrote a Python script that monitors a log file on a CentOS server for a specific value and sends an email when it finds it. It runs as a cron job every 5 minutes.
My question is: what is the best way to put this script to sleep after it has sent the first email? I don't want it sending emails every 5 minutes, but it needs to wake up and check the log again after an hour or so. This assumes the problem can be fixed in under an hour. The people receiving the email don't have shell access to disable the cron job.
I thought about sleep, but I'm not sure whether cron will try to run the script again while another instance is still active (sleeping).
cron will absolutely run the script again. You need to think this through a little more carefully than just "sleep" and "don't email every 5 minutes."
You need to write out your use cases.
- System sends message and user does something.
- System sends message and user does nothing. Why email the user again? What does 2 emails do that 1 email didn't do? Perhaps you should SMS or email someone else instead.
- How does the user register that something was done? How will they cancel or stop this cycle of messages?
- What if something is found in the log, an email is sent, and then (before the sleep finishes) the thing is found again in the log? Is that a second email? It is two incidents. Or is it one email with two incidents?
@Lennart, @S. Lott: I think the question was somewhat the other way around: the script runs as a cron job every five minutes, but after sending an error email it shouldn't send another for at least an hour (even if the error state persists).
The obvious answer, I think, is to keep a self-log: for each problem detected, an id and a timestamp of the last time an email was sent. When a problem is detected, check the self-log; if the last email for this problem id went out less than an hour ago, don't send another. Then the program can exit normally until cron calls it again.
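A minimal sketch of such a self-log, assuming a JSON state file (the file location and the one-hour window are placeholders you would adjust):

```python
import json
import os
import tempfile
import time

# Placeholder location for the self-log; one entry per problem id.
STATE_FILE = os.path.join(tempfile.gettempdir(), 'log_monitor_state.json')
MIN_INTERVAL = 3600  # seconds to stay quiet after emailing about a problem

def should_send(problem_id, now=None):
    """Return True (and record the send) only if the last email for this
    problem id went out more than MIN_INTERVAL seconds ago."""
    now = time.time() if now is None else now
    state = {}
    if os.path.exists(STATE_FILE):
        with open(STATE_FILE) as f:
            state = json.load(f)
    if now - state.get(problem_id, 0) < MIN_INTERVAL:
        return False
    state[problem_id] = now
    with open(STATE_FILE, 'w') as f:
        json.dump(state, f)
    return True

# Demo: the first alert mails, the repeat within the hour is suppressed.
if os.path.exists(STATE_FILE):
    os.remove(STATE_FILE)
print(should_send('disk_full'))  # True: first report, email goes out
print(should_send('disk_full'))  # False: still inside the quiet hour
```

The cron job keeps running every five minutes as before; it simply exits without mailing whenever `should_send` returns False.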
When your script sends the email, have it also create a text file, e.g. "email_sent.txt". Then make it check for the existence of this file before sending an email. If it exists, don't send the email. If it does not exist, send the email and create the text file.
The text file serves as an indicator that the email has already been sent and does not need to be sent again.
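As a sketch of that flag-file idea (the file name and the stand-in send callback are placeholders):

```python
import os
import tempfile

# Marker file whose existence means "alert already mailed".
FLAG_FILE = os.path.join(tempfile.gettempdir(), 'email_sent.txt')

def notify_once(send):
    """Call send() only if no alert has gone out yet, then drop the flag."""
    if os.path.exists(FLAG_FILE):
        return False
    send()
    open(FLAG_FILE, 'w').close()
    return True

# Demo with a list-appending stand-in for the real mail function:
if os.path.exists(FLAG_FILE):
    os.remove(FLAG_FILE)  # start the demo from a clean state
sent = []
notify_once(lambda: sent.append('mail'))
notify_once(lambda: sent.append('mail'))  # suppressed: flag already exists
```

Whoever fixes the underlying problem (or a second cron job an hour later) deletes the flag file to re-arm the alert.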
You are running it every five minutes. Why would you sleep it? Just exit. If you want to make sure it doesn't send an email every five minutes, make the program send an email only when there is something to send.
If you sleep it for an hour while running it every five minutes, after an hour you'll have 12 copies running (and twelve emails sent), so that's clearly not the way to go forward. :-)
Another way to go about this might be to run your script as a daemon and, instead of having cron run it every five minutes, put your logic in a loop. Something like this...
import time

while True:
    # check_my_logfile() looks for what you want.
    # If it finds it, it sends an email and returns True.
    if check_my_logfile():
        # An email just went out, so back off for 10 minutes.
        time.sleep(600)
    else:
        # Nothing found: check again in 5 minutes.
        time.sleep(300)
Since you are monitoring a log file, it might be worth looking into tools that already do log-file monitoring. Logwatch is one, and there are log-analysis tools that handle all of these things for you:
http://chuvakin.blogspot.com/2010/09/on-free-log-management-tools.html
is a good wrap-up of some options. They would handle notifying people. There are also system-monitoring tools such as OpenNMS or Nagios that do these things as well.
I agree with what other people have said above: cron ALWAYS runs the job at the specified time. There is, however, a tool called at which lets you schedule jobs in the future. So you could submit a job for 5 minutes from now, and then at runtime decide when you need to run next and submit a new job to at for whatever time that is (be it 5 minutes, 10 minutes or an hour). You'd still need to keep state somewhere (as @infrared said) to track what got sent when, and whether you should still care.
I'd still suggest using a system monitoring tool, which would easily grow and scale and handles people being able to say 'I'm working on XX NOW stop yelling at me' for instance.
Good luck!
This is one of the trickiest issues I have ever faced in my 15-year programming career.
The setup is a deployed Django app. We have a feature which allows users to invite other users.
I loop through a list of email addresses. In each iteration of the loop, I do a number of things to provision the user in our system, send them a welcome email, and record the event in third-party systems.
The body of the loop looks like this:
try:
    # ... some other code ...
    send_new_user_welcome_email(
        user,
        self.inviter,
        temp_password,
        welcome_message=welcome_message,
        is_reviewer=True
    )
    analytics_record_event(
        self.inviter,
        EVENT_INVITED_USER,
        invite_type='reviewer',
        invited_email=email
    )
    record_customer_io_invited_someone(
        self.inviter,
        email
    )
except:
    logger.exception('While inviting user "%s"' % email)
The problem I am seeing is that occasionally (about once every 50 or so times someone invites one or more users), the analytics_record_event function does not seem to execute. It does not raise any exceptions. It is simply skipped over and the next line executes.
In order to diagnose the issue, I have added logging to the analytics_record_event function to log to a file every time it gets called:
def analytics_record_event(user,
                           event_name,
                           skip_mixpanel=False,
                           skip_preact=False,
                           **properties):
    username = user.username if hasattr(user, 'username') else user
    logger.info(
        u'analytics_record_event called for user %s and event "%s"' % (username, event_name)
    )
I am looking for ideas as to how this could be. I have already spent a lot of time looking into this, and my findings are below:
send_new_user_welcome_email gets called. Our SMTP server logs confirm that the email goes out. An internal email log corroborates this as well.
record_customer_io_invited_someone gets called. The event data exists for that user in the external system.
No entries for the u'analytics_record_event... log statement can be found in my log files for the times when this is failing. At other times, when this function is executing fine (majority of the time), the log entries are there.
No exception logs for the above except statement exist.
No un-handled exceptions in this timeframe are found.
When it fails, it fails for every iteration of the loop, for every email invited as part of the whole invite-users request. There has not been one case where an invite request worked for some of the invited emails and failed for others within that one request.
It does not seem to be data-related. The email addresses do not contain any strange characters.
It looks like occasionally, the call to analytics_record_event is simply not being made. This is causing me great grief. If anyone can suggest a path of investigation for this, I would greatly appreciate it.
The answer turned out to be surprising and extremely mundane.
The code I was tracing through was part of the standard user creation process in the system I am working on. However, there is a bulk-creation process based on CSV files which does analogous work. analytics_record_event was not being called in that process.
Lessons for the future:
- I thought I was being clever and covering a lot of ground by putting a function decorator on analytics_record_event. The results only made me more puzzled, leading to this desperate plea for help :). Had I added a log statement before and after the call to analytics_record_event in this particular function, I would have realized much earlier that it was not being called at all! KISS at work.
I have a script which can be run by any user connected to a server. This script writes to a single log file, but there is no restriction on who can use it at any one time, so multiple people could attempt to write to the log and data might be lost. Is there a way for one instance of the code to know whether other instances of that code are running? Moreover, is it possible to gather this information dynamically (i.e. not allow data saving for the second user until the first user has completed his/her task)?
I know I could do this with a text file: write the user name to the file when they start, then delete it when they finish. But this could lead to errors if either step is missed, such as on an unexpected script termination. So what other reliable ways are there?
Some information on the system: Python 2.7 is installed on a Windows 7 64-bit server via Anaconda. All connected machines are also Windows 7 64-bit. Thanks in advance.
Here is an implementation:
http://www.evanfosmark.com/2009/01/cross-platform-file-locking-support-in-python/
If you are using a lock, be aware that stale locks (left behind by hung or crashed processes) can be a real pain. Have a process that periodically searches for locks created more than X minutes ago and frees them.
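A sketch of that idea using an atomic O_EXCL create plus an age check (the lock path and 10-minute timeout are placeholders; FileExistsError assumes Python 3):

```python
import os
import tempfile
import time

LOCK_FILE = os.path.join(tempfile.gettempdir(), 'script.lock')  # placeholder

def acquire_lock(max_age=600):
    """Take the lock atomically; break it if the holder looks dead."""
    try:
        # O_CREAT | O_EXCL fails if the file already exists,
        # so exactly one process can win the race.
        fd = os.open(LOCK_FILE, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        os.write(fd, str(os.getpid()).encode())
        os.close(fd)
        return True
    except FileExistsError:
        # Stale? Anything older than max_age seconds gets freed.
        if time.time() - os.path.getmtime(LOCK_FILE) > max_age:
            os.remove(LOCK_FILE)
            return acquire_lock(max_age)
        return False

def release_lock():
    if os.path.exists(LOCK_FILE):
        os.remove(LOCK_FILE)

# Demo: a second attempt fails until the lock goes stale.
release_lock()
results = [acquire_lock(), acquire_lock()]
old = time.time() - 700
os.utime(LOCK_FILE, (old, old))          # pretend the holder died
results.append(acquire_lock(max_age=600))
release_lock()
```

This is only a sketch: the remove-and-retry step itself has a small race window if two waiters break the same stale lock simultaneously, which a periodic cleanup process avoids.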
It just isn't clean to allow multiple users to write to a single log and hope things go OK.
Why don't you write a daemon that handles the logs? Other processes connect to a "logging port" and, in the simplest case, only succeed if no one else has connected.
You can just modify the echo-server example given here (keep a timeout in the server for all connections):
http://docs.python.org/release/2.5.2/lib/socket-example.html
If you want to know exactly who logged what, and make sure no one unauthorized gets in, you can use Unix sockets to restrict it to only certain uids/gids, etc.
Here is a very good example
NTEventLogHandler is probably the easiest way for logging to a given Windows machine/server, but it might make more sense to use SyslogHandler if you have a syslog sink on a Unix server.
The catch I can think of with SyslogHandler is that you'll likely need to poke holes through the Windows firewall in order to send packets over the syslog protocol, i.e., 514/TCP ("reliable syslog") and 514/UDP (traditional or "unreliable syslog").
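For the SyslogHandler route, the standard library's logging.handlers.SysLogHandler covers the Python side; this sketch assumes a syslog sink reachable at the placeholder address below (UDP 514 is the traditional transport):

```python
import logging
import logging.handlers

logger = logging.getLogger('winapp')
logger.setLevel(logging.INFO)

# Point this at your Unix syslog sink; replace the placeholder address.
# UDP is fire-and-forget, so a missing sink won't raise here.
handler = logging.handlers.SysLogHandler(address=('127.0.0.1', 514))
logger.addHandler(handler)

logger.info('log line shipped from the Windows box')
```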
I need to read an Apache log file in real time from the server, and if some string is found, an email has to be sent. I have adapted the code found here to read the log file. Next, how do I send this email? Do I have to issue a sleep command? Please advise.
Note: since this is real time, after sending the email the Python program has to resume reading the log file. This process continues.
import time

# Follow the log file, tail -f style.
filename = '/var/log/apache2/access.log'
file = open(filename, 'r')

while 1:
    where = file.tell()
    line = file.readline()
    if not line:
        # Nothing new yet: wait a second and re-seek.
        time.sleep(1)
        file.seek(where)
    else:
        if 'MyTerm' in line:
            print line
Well, if you want it real time and don't want to get stuck sending mails, you could start a separate thread to send the email. Here is how you use threads in Python (thread and threading):
http://www.tutorialspoint.com/python/python_multithreading.htm
Next, you can easily send an email in Python using smtplib. Here is another example from the same website (which I use and it is pretty good):
http://www.tutorialspoint.com/python/python_sending_email.htm
You need to do this to keep the log-reading thread as fast as possible and to be sure it never waits on mailing.
Now some pitfalls you have to take care of:
You must be careful about starting too many threads. For instance, suppose you parse the log every second but sending an email takes 10 seconds. It is easy to see that (this is an exaggerated example, of course) you will start many threads and exhaust the available resources. I don't know how often the string you are expecting will appear, but it is a scenario you must consider.
Again depending on the workload, you could implement a streaming algorithm and avoid some emails entirely. I don't know if it applies in your case, but I want to remind you of this scenario too.
You can create a queue, put a certain number of messages in it, and send them together, thus avoiding sending many mails at once (again assuming you don't need to trigger an alarm for every single occurrence of your target string).
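One way to sketch that queue idea: a single worker thread drains the queue and groups hits into one message, so the log reader never blocks. Here send_alert_email is just a list-appending stand-in; in real code it would wrap smtplib.

```python
import queue
import threading

# Stand-in for a real smtplib-based sender.
sent_batches = []
def send_alert_email(body):
    sent_batches.append(body)

alert_queue = queue.Queue()
STOP = object()  # sentinel telling the worker to shut down

def mail_worker(batch_size=5, flush_interval=2.0):
    """Drain the queue, grouping hits so one email covers several."""
    batch = []
    while True:
        try:
            item = alert_queue.get(timeout=flush_interval)
        except queue.Empty:
            item = None  # idle: flush whatever has accumulated
        if item is STOP:
            break
        if item is not None:
            batch.append(item)
        if batch and (item is None or len(batch) >= batch_size):
            send_alert_email('\n'.join(batch))
            batch = []
    if batch:  # final flush on shutdown
        send_alert_email('\n'.join(batch))

worker = threading.Thread(target=mail_worker, daemon=True)
worker.start()

# The log-reading loop only enqueues; it never waits on SMTP.
for hit in ['MyTerm seen at 10:01', 'MyTerm seen at 10:02']:
    alert_queue.put(hit)
alert_queue.put(STOP)
worker.join()
```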
UPDATE
If you really want to polish this program, you can do something else: use event triggering when the log file is modified. This way you avoid sleep entirely; each time something is added to the file, your Python code is invoked and you can parse the new content and send the email if required. Take a look at watchdog:
http://pythonhosted.org/watchdog/
and this:
python pyinotify to monitor the specified suffix files in a dir
https://github.com/seb-m/pyinotify
To work a bit on my Python, I decided to try coding a simple script for my private use which monitors sites with offers and sends me an email whenever a new offer I'm interested in pops up. I guess I can handle the coding part (extracting the newest one from the HTML and such), but I've never run a script online that needs to be fired every N minutes or so. What kind of hosting/server do I need to make my script run independently of my computer, refresh every, say, 5 minutes, and send me an email when there's an update?
If you have shell access, you can use crontab to schedule a recurring job.
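For example, a single crontab entry like this (the script path is a placeholder) runs the checker every five minutes:

```
# Install with `crontab -e`; runs the script every 5 minutes.
*/5 * * * * /usr/bin/python /home/user/check_offers.py
```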
Otherwise you can use a service like SetCronJob or EasyCron or similar to invoke a script regularly.
Some hosts also provide similar functionality in their administration interface.
I'm using pyHook to hook keyboard and mouse actions. A log of the hooks is created while the program is running. I want to send the log to an email address every 5 minutes, while PumpMessages is still working and the hooking is still running. I know how to send emails in Python; however, I don't know how to do it at a fixed interval.
You can try Python schedulers to call your methods at a given interval.
You may want to look at this module:
http://packages.python.org/APScheduler/
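Even without an external scheduler, the standard library is enough: a background thread plus an Event gives a fixed-interval callback while the main thread stays free for PumpMessages. In this sketch the 0.05-second interval and list-appending callback are demo stand-ins; for the real thing you'd use 300 seconds and an smtplib-based sender.

```python
import threading
import time

def call_periodically(interval, func, stop_event):
    """Run func every `interval` seconds until stop_event is set."""
    while not stop_event.wait(interval):  # wait() doubles as the sleep
        func()

calls = []
stop = threading.Event()
worker = threading.Thread(
    target=call_periodically,
    args=(0.05, lambda: calls.append(time.time()), stop),
    daemon=True,  # don't keep the process alive on exit
)
worker.start()

# In the real script, this is where pyHook's PumpMessages() would run.
time.sleep(0.3)
stop.set()
worker.join()
```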