Python TimedRotatingFileHandler - logs are missing

Python TimedRotatingFileHandler - logs are missing - python

I am running my python application in apache environment and using timedRotatingFileHandler to log.
I have setup logger in a way that It is supposed to rotate midnight everyday. My all processes writes into the same logger file. Somehow logger is missing to log info at times. And sometimes I see logger writing into two files (old file and rotated file) at the same time.
I couldn't able to understand why is this happening? Doesn't TimedrotatingFileHandler work in multiprocess enivironment? If not why is that so?
Please help me to understand..

You can't log (and rotate) to the same file from multiple processes naively because OS wouldn't know how to serialize the write and rotate instructions from 2 different processes. What you are experiencing is known as a race condition as 2 processes are competing to write to the same file and close it and open with a new file handle at the same time at rotation time. Only 1 process will win a new file handle when you rotate, so that may explain the missing log event.
Here's a recipe from Python's documentation with hints about how to log to the same place.
http://docs.python.org/howto/logging-cookbook.html#logging-to-a-single-file-from-multiple-processes
Essentially you will want to have a separate process listening to logging events coming from multiple places and then that process will log the events to a single file. You can configure rotation in that listener process too.
If you are not sure how to write this, you can try using a package such as Sentry or Facebook's Scribe. I recommend Sentry because Scribe is not trivial to setup.

Related

File descriptor doesn't update for logging in Django

We use Python(2.7)/Django(1.8.1) and Gunicorn(19.4.5) for our web application and supervisor(3.0) to monitor it. I have recently encountered 2 issues in logging:
Django was logging into previous day logs(We have log rotation enabled)
Django was not logging anything at all.
The first scenario is understandable where the log rotation changed the file but Django was not updated.
The second scenario fixed when I restarted the supervisor process. Which led me to believe again the file descriptor was not updated in the django process.
I came by this SO thread which states:
Each child is an independent process, and file handles in the parent
may be closed in the child after a fork (assuming POSIX). In any case,
logging to the same file from multiple processes is not supported.
So I have few questions:
My gunicorn has 4 child processes and if one of them fails while
writing to a log file will the other child process won't be able to
use it? and how to debug these kind of scenarios?
Personally I found debugging errors in python logging module to be
difficult. Can some one point how to debug errors such as this or is
there any way I can monkey patch logging to not fail silently?*(Kindly read update section)*
I have seen Django LogRotation causes the Issue type 1 as explained above and not some script scheduled via cron. So what is preferable?
Note: The logging config is not a problem. I have already spent fair amount of time trying to figure that out. Also if the config is the issue Django will not write log files after a process restart.
Update:
For my second question I see that logging modules provides an option to raiseExceptions on failure although this is discourages in production environment. Documentation here. So now my question becomes how do I set this in Django?

I felt like closing this question. Bit awkward and seems stupid after 2 months. But I guess being stupid is part of the learning and want this to be as a reference for people who stumble across this.
Scenario 1: Django on using TimedRotatingFileHandler seems not to update the file descriptor some times and hence writes to old log files unless we restart the supervisor. We are yet to find the reason for this behaviour and update the reason if found. For now we are using WatchedFileHandler and then using logrotate utility to rotate the logs.
Scenario 2: This is the stupid question. When I was logging with some string formatting I forgot to give enough variables which is why the logger was erring. But this didn't get propagated. But locally when I was testing I found that logging module was actually throwing that error but silently and any logs after it in the module were not getting printed. Lessons learns from this scenario were:
If there is a problem in logging find out if the string formatting does not err
Using log.debug('example: {msg}'.format(msg=msg)) of python instead of log.debug('example: %s', msg).

What happens if I log into the same file from multiple different processes in python?

I spent hours to dig the behavior, first about those questions:
Atomicity of `write(2)` to a local filesystem
How can I synchronize -- make atomic -- writes on one file from from two processes?
How does one programmatically determine if "write" system call is atomic on a particular file?
What happens if a write system call is called on same file by 2 different processes simultaneously
http://article.gmane.org/gmane.linux.kernel/43445
It seems if we use 'O_APPEND' flag when opening file, it will always be ok to logging to same file from multiple processes, on linux. And i believe python surely use 'O_APPEND' flag in its logging module.
And from a small test:
#!/bin/env python
import os
import logging
logger = logging.getLogger('spam_application')
logger.setLevel(logging.DEBUG)
# create file handler which logs even debug messages
fh = logging.FileHandler('spam.log')
logger.addHandler(fh)
formatter = logging.Formatter(
'%(asctime)s - %(name)s - %(levelname)s - %(message)s')
fh.setFormatter(formatter)
for i in xrange(10000):
p = os.getpid()
logger.debug('Log line number %s in %s', i, p)
And i run it with:
./test.py & ./test.py & ./test.py & ./test.py &
I found there were nothing wrong in spam.log. This behavior may support the conclusion above.
But problems comming:
What does this mean here?
And what are the scenes to use this, just for file rotation?
At last, if two process are doing write on one same file, i mean they are invoking write(2) on the same file, who make sure the data from the two processes not interleave(kernel or filesystem?), and how.[NOTE: i just want to see in deep about the write syscall, any hit about this is welcome.]
EDIT1 :
Do this and this just exist there for compatibility between different os environments, like windows, linux, or mac?
EDIT2 :
One more test, feed 8KB strings to logging.debug each time. And this time i can see the "interleaving" behavior in spam.log.
This behavior is just what specified about PIPE_BUF in one page above. So it seems the behavior is clear on linux, using O_APPEND is ok if the size to write(2) is less then PIPE_BUF.

I digged deeper and deeper. Now I think it's clear about these facts:
With O_APPEND, parallel write(2) from multiple processes is ok. It's just order of lines are undetermined, but lines don't interleave or overwrite each other. And the size of data is by any amount, according to Niall Douglas's answer for Understanding concurrent file writes from multiple processes. I have tested about this "by any amount" on linux and have not found the upper limit, so i guess it's right.
Without O_APPEND, it will be a mess. Here is what POSIX says "This volume of POSIX.1-2008 does not specify behavior of concurrent writes to a file from multiple processes. Applications should use some form of concurrency control."
Now we come into python. The test i did in EDIT3, that 8K, i found it's origin. Python's write() use fwrite(3) in fact， and my python set a BUFF_SIZE here which is 8192. According to an answer from abarnert in Default buffer size for a file on Linux. This 8192 has a long story.
However, more information is welcome.

I would not rely on tests here. Weird things can only happen in race conditions, and exhibiting a race condition by test is almost non sense because the race condition is unlikely to occur. So it can work nicely for 1000 test runs and randomly break later in prod... The page you cite says:
logging to a single file from multiple processes is not supported, because there is no standard way to serialize access to a single file across multiple processes in Python
That does not means that it will break... it could even be safe in a particular implementation on a particular file system. It just means that it can break without any hope of fix on any other version of Python or on any other filesystem.
If you really want to make sure of it, you will have to dive into Python source code (for your version) to control how logging is actually implemented, and control whether is it safe on your file system. And you will always be threatened by the possibility that a later optimization in logging module breaks your assumptions.
IMHO that is the reason for the warning in Logging Cookbook, and the existence of a special module to allow concurrent logging to the same file. This last one does not rely on anything unspecified, but just uses explicit locking.

I have tried a similar code like this(I have tried in python 3)
import threading
for i in range(0,100000):
t1 = threading.Thread(target= funtion_to_call_logger, args=(i,))
t1.start()
This worked completely fine for me, similar issue is addressed here.
This took a lot of Cpu time but not memory.
EDIT:
Fine means all the requested things were logged but Order was missing.Hence Race condition still not fixed ,

Python2: How to parse a logfile that is held open in another process reliably?

I'm trying to write a Python script that will parse a logfile produced by another daemon. This is being done on Linux. I want to be able to parse the log file reliably.
In other words, periodically, we run a script that reads the log file, line by line, and does something with each line. The logging script would need to see every line that may end up in the log file. It could run say once per minute via cron.
Here's the problem that I'm not sure exactly how to solve. Since the other process has a write handle to the file, it could write to the while at the same time that I am reading from the same log file.
Also, every so often we would want to clear this logfile so its size does not get out of control. But the process producing the log file has no way to clear the file other than regularly stopping, truncating or deleting the file, and then restarting. (I feel like logrotate has some method of doing this, but I don't know if logrotate depends on the daemon being aware, or if it's actually closing and restarting daemons, etc. Not to mention I don't want other logs rotated, just this one specific log; and I don't want this script to require other possible users to setup logrotate.)
Here's the problems:
Since the logger process could write to the file while I already have an open file handle, I feel like I could easily miss records in the log file.
If the logger process were to decide to stop, clear the log file, and restart, and the log analyzer didn't run at exactly the same time, log entries would be lost. Similarly, if the log analyzer causes the logger to stop logging while it analyzes, information could also be lost that is dropped because the logger daemon isn't listening.
If I were to use a method like "note the size of the file since last time and seek there if the file is larger", then what would happen if, for some reason, between runs, the logger reset the logfile, but then had reason to log even more than it contained last time? E.g. We execute a log analyze loop. We get 50 log entries, so we set a mark that we have read 50 entries. Next time we run, we see 60 entries. But, all 60 are brand new; the file had been cleared and restarted since the last log run. Instead we end up seeking to entry 51 and missing 50 entries! Either way it doesn't solve the problem of needing to periodically clear the log.
I have no control over the logger daemon. (Imagine we're talking about something like syslog here. It's not syslog but same idea - a process that is pretty critical holds a logfile open.) So I have no way to change its logging method. It starts at init time, opens a log file, and writes to it. We want to be able to clear that logfile AND analyze it, making sure we get every log entry through the Python script at some point.
The ideal scenario would be this:
The log daemon runs at system init.
Via cron, the Python log analyzer runs once per minute (or once per 5 minutes or whatever is deemed appropriate)
The log analyzer collects every single line from the current log file and immediately truncates it, causing the log file to be blanked out. Python maintains the original contents in a list.
The logger then continues to go about its business, with the now blank file. In the mean time, Python can continue to parse the entries at its leisure from the Python list in memory.
I've very, very vaguely studied fifo's, but am not sure if that would be appropriate. In that scenario the log analyzer would run as a daemon itself, while the original logger writes to a FIFO. I have very little knowledge in this area however and don't know if it'd be a solution or not.
So I guess the question really is twofold:
How to reliably read EVERY entry written to the log from Python? Including if the log grows, is reset, etc.
How, if possible to truncate a file that has an open write handle? (Ideally, this would be something I could do from Python; I could do something like logfile.readlines(); logfile.truncate so that way no entries would get lost. But this seems like unless the logger process was well aware of this, it'd end up causing more problems than it solves.)
Thanks!

I don’t see any particular reason why you should be not able to read log file created by syslogd. You are saying that you are using some process similar to syslog, and process is keeping your log file open? Since you are asking rather for ideas, I would recommend you to use syslog! http://pic.dhe.ibm.com/infocenter/tpfhelp/current/index.jsp?topic=%2Fcom.ibm.ztpf-ztpfdf.doc_put.cur%2Fgtpc1%2Fhsyslog.html
It is working anyway – use it. Some easy way to write to log is to use logger command:
logger “MYAP: hello”
In python script you can do it like:
import os
os.system(‘logger “MYAP: hello”’)
Also remember you can actually configure syslogd. http://pic.dhe.ibm.com/infocenter/tpfhelp/current/index.jsp?topic=%2Fcom.ibm.ztpf-ztpfdf.doc_put.cur%2Fgtpc1%2Fconstmt.html
Also about your problem with empty logs – sysclog is not clearing logs. There are other tools for it – on debian for example logrotate is used. In this scenario if your log is empty – you can check backup file created by logrotate.
Since it looks like your problem is in logging tool, my advise would be to use syslog for logging. And other tool for rotating logs. Then you can easily parse logs. And if by any means (I don’t know if it is even possible with syslog) you miss some data – remember you will get it in next iteration anyway ;)
Some other idea would be to copy your logfile and work with copy...

Complete log management (python)

Similar questions have been asked, but I have not come across an easy-to-do-it way
We have some application logs of various kinds which fill up the space and we face other unwanted issues. How do I write a monitoring script(zipping files of particular size, moving them, watching them, etc..) for this maintenance? I am looking for a simple solution(as in what to use?), if possible in python or maybe just a shell script.
Thanks.

The "standard" way of doing this (atleast on most Gnu/Linux distros) is to use logrotate. I see a /etc/logrotate.conf on my Debian machine which has details on which files to rotate and at what frequency. It's triggered by a daily cron entry. This is what I'd recommend.
If you want your application itself to do this (which is a pain really since it's not it's job), you could consider writing a custom log handler. A RotatingFileHandler (or TimedRotatingFileHandler) might work but you can write a custom one.
Most systems are by default set up to automatically rotate log files which are emitted by syslog. You might want to consider using the SysLogHandler and logging to syslog (from all your apps regardless of language) so that the system infrastructure automatically takes care of things for you.

Use logrotate to do the work for you.
Remember that there are few cases where it may not work properly, for example if the logging application keeps the log file always open and is not able to resume it if the file is removed and recreated.
Over the years I encountered few applications like that, but even for them you could configure logrotate to restart them when it rotates the logs.

Effectively reading a large, active Python log file

When my Python script is writing a large amount of logs to a text file line by line using the Python built-in logging library, in my Delphi-powered Windows program I want to effectively read all newly added logs (lines).
When the Python scripting is logging
to the file, my Windows program will
keep a readonly file handle to
that log file;
I'll use the Windows API to get
informed when the log file is
changed; Once the file is changed, it'll read the newly appended lines.
I'm new to Python, do you see any possible problem with this approach? Does the Python logging lib lock the entire log? Thanks!

It depends on the logging handler you use, of course, but as you can see from the source code, logging.FileHandler does not currently create any file locks. By default, it opens files in 'a' (append) mode, so as long as your Windows calls can handle that, you should be fine.

As ʇsәɹoɈ commented, the standard FileHandler logger does not lock the file, so it should work. However, if for some reason you cannot keep you lock on the file - then I'd recommend having your other app open the file periodically, record the position it's read to and then seek back to that point later. I know the Linux DenyHosts program uses this approach when dealing with log files that it has to monitor for a long period of time. In those situations, simply holding a lock isn't feasible, since directories may move, the file get rotated out, etc. Though it does complicate things in that then you have to store filename + read position in persistent state somewhere.

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.