I have a file results.txt on a server which is accessed by multiple VMs through NFS. A process runs on each of these VMs which reads the results.txt file and modifies it. If two processes, A and B, read the file at the same time, then modification of either A or B would be present in results.txt based on the order in which the processes write to the file.
If process A has a write lock over the file then process B would have to wait till the lock is released to read the results.txt file.
I have tried implementing this using Python:
import fcntl
f = open("/path/result.txt")
fcntl.flock(f,fcntl.LOCK_EX)
#code
It works as expected for files on the local disk, but when I try to lock a file on the NFS-mounted path, I get the following error:
Traceback (most recent call last):
  File "lock.py", line 12, in <module>
    fcntl.flock(f,fcntl.LOCK_EX)
IOError: [Errno 45] Operation not supported
I tried both fcntl.fcntl and fcntl.flock but got the same error. Is this an issue with the way I am using fcntl? Is any configuration required on the server where the file is stored?
Edit:
This is how I am using fcntl.fcntl:
import struct
import fcntl

f = open("results.txt")
lockdata = struct.pack('hhllhh', fcntl.F_RDLCK, 0, 0, 0, 0, 0)
rv = fcntl.fcntl(f, fcntl.F_SETLKW, lockdata)
The NFS server version is 3.
I found flufl.lock most suited for my requirement.
Quoting the author from project page:
[...] O_EXCL is broken on NFS file systems, programs which rely on it for performing locking tasks will contain a race condition. The solution for performing atomic file locking using a lockfile is to create a unique file on the same fs (e.g., incorporating hostname and pid), use link(2) to make a link to the lockfile. If link() returns 0, the lock is successful. Otherwise, use stat(2) on the unique file to check if its link count has increased to 2, in which case the lock is also successful.
Since it is not part of the standard library I couldn't use it. Also, my requirement was only a subset of all the features offered by this module.
The following functions were written based on that module's approach. Please adapt them to your requirements.
import errno
import os
import time

lock_owner = False

def lockfile(target, link, timeout=300):
    """Try to acquire the lock by hard-linking `target` to `link`."""
    global lock_owner
    poll_time = 10
    while timeout > 0:
        try:
            os.link(target, link)
            print("Lock acquired")
            lock_owner = True
            break
        except OSError as err:
            if err.errno == errno.EEXIST:
                print("Lock unavailable. Waiting for 10 seconds...")
                time.sleep(poll_time)
                timeout -= poll_time
            else:
                raise
    else:
        print("Timed out waiting for the lock.")

def releaselock(link):
    """Release the lock by removing the link, but only if we own it."""
    try:
        if lock_owner:
            os.unlink(link)
            print("File unlocked")
    except OSError:
        print("Error: didn't possess lock.")
This is a crude implementation that works for me. I have been using it and haven't faced any issues. There are many things that can be improved though. Hope this helps.
Related
In my Flask application I have implemented a logging system using the logging library. It is currently run as shown below:
if __name__ == "__main__":
    """[Runs the webserver.
    Finally block is used for some logging management. It will first shut down
    logging, to ensure no files are open, then renames the file to 'log_'
    + the current date, and finally moves the file to the /logs archive
    directory]
    """
    try:
        session_management.clean_uploads_on_start(UPLOAD_FOLDER)
        app.run(debug=False)
    finally:
        try:
            logging.shutdown()
            new_log_file_name = log_management.rename_log(app.config['DEFAULT_LOG_NAME'])
            log_management.move_log(new_log_file_name)
        except FileNotFoundError:
            logging.warning("Current log file not found")
        except PermissionError:
            logging.warning("Permissions lacking to rename or move log.")
I discovered that the file is not renamed and moved if the cmd prompt is force closed or if the server crashes. I thought it might be better to put the rename and move into the initial 'try' block of the function, prior to the server starting, but I run into issues because I have a config file (which is imported in this script) containing the following code:
logging.basicConfig(filename='current_log.log', level=logging.INFO,
                    filemode='a',
                    format='%(asctime)s:%(levelname)s:%(message)s')
I have tried something like the code below, but I still run into permission errors, I think because the log_management script also imports config. Further, I could not find a function which starts the logging system the way logging.shutdown() ends it; otherwise I would shut logging down, move the file (if it exists) and then start it back up.
try:
    session_management.clean_uploads_on_start(UPLOAD_FOLDER)
    log_management.check_log_on_startup(app.config['DEFAULT_LOG_NAME'])
    import config
    app.run(debug=False)
finally:
    try:
        logging.shutdown()
        new_log_file_name = log_management.rename_log(app.config['DEFAULT_LOG_NAME'])
        log_management.move_log(new_log_file_name)
    except FileNotFoundError:
        logging.warning("Current log file not found")
    except PermissionError:
        logging.warning("Permissions lacking to rename or move log.")

# (in another script)
def check_log_on_startup(file_name):
    if os.path.exists(file_name):
        move_log(rename_log(file_name))
Any suggestions much welcomed, because I feel like I'm at a brick wall!
As you have already found out, trying to perform cleanups at the end of your process life cycle has the potential to fail if the process terminates uncleanly.
The issue with performing the cleanup at the start is that you apparently call logging.basicConfig from your import before attempting to move the old log file.
This leads to the implicitly created FileHandler holding an open file object on the existing log when you attempt to rename and move it. Depending on the file system you are using, this might not be met with joy.
If you want to move the handling of potential old log files to the start of your application completely, you have to perform the renaming and moving before you call logging.basicConfig, so you'll have to remove it from your import and add it to the log_management somehow.
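For illustration, the ordering would have to look roughly like this (a sketch reusing your own module and function names; the log file name is assumed to match your config):

import logging
import log_management

# Archive any leftover log from a previous run *before* logging opens the file.
log_management.check_log_on_startup('current_log.log')

logging.basicConfig(filename='current_log.log', level=logging.INFO,
                    filemode='a',
                    format='%(asctime)s:%(levelname)s:%(message)s')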
As an alternative, you could move the whole handling of log files to the logging file handler by subclassing the standard FileHandler class, e.g:
import logging
import os
from datetime import datetime

class CustomFileHandler(logging.FileHandler):
    def __init__(self, filename, archive_path='archive', archive_name='log_%Y%m%d', **kwargs):
        self._archive = os.path.join(archive_path, archive_name)
        self._archive_log(filename)
        super().__init__(filename, **kwargs)

    def _archive_log(self, filepath):
        if os.path.exists(filepath):
            os.rename(filepath, datetime.now().strftime(self._archive))

    def close(self):
        super().close()
        self._archive_log(self.baseFilename)
With this, you would configure your logging like so:
hdler = CustomFileHandler('current.log')
logging.basicConfig(level=logging.INFO, handlers=[hdler],
                    format='%(asctime)s:%(levelname)s:%(message)s')
The CustomFileHandler will check for, and potentially archive, old logs during initialization. This will deal with leftovers after an unclean process termination where the shutdown cleanup cannot take place. Since the parent class initializer is called after the log archiving is attempted, there is not yet an open handle on the log that would cause a PermissionError.
The overwritten close() method will perform the archiving on a clean process shutdown.
This should remove the need for the dedicated log_management module, at least as far as the functions you show in your code are concerned. rename_log, move_log and check_log_on_startup are all encapsulated in the CustomFileHandler. There is also no need to explicitly call logging.shutdown().
Some notes:
The reason you cannot find a start function equivalent to logging.shutdown() is that the logging system is started/initialized when you import the logging module. Among other things, it instantiates the implicit root logger and registers logging.shutdown as exit handler via atexit.
The latter is the reason why there is no need to explicitly call logging.shutdown() with the above solution. The Python interpreter will call it during finalization when preparing for interpreter shutdown due to the exit handler registration. logging.shutdown() then iterates through the list of registered handlers and calls their close() methods, which will perform the log archiving during a clean shutdown.
Depending on the method you choose for moving (and renaming) the old log file, the above solution might need some additional safeguards against exceptions. On Windows, os.rename will raise an exception if the destination path already exists, i.e. when you have already stopped and started your process previously on the same day, while os.replace would silently overwrite the existing file. See more details about moving files via Python here.
Thus I would recommend naming the archived logs not only by the current date but also by the time.
In the above, adding the current date to the archive file name is done via datetime's strftime, hence the 'log_%Y%m%d' as default for the archive_name parameter of the custom file handler. The characters with a preceding % are valid format codes that strftime() replaces with the respective parts of the datetime object it is called on. To append the current time to the archive log file name you would simply append the respective format codes to the archive_name, e.g.: 'log_%Y%m%d_%H%M%S' which would result in a log name such as log_20200819_123721.
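For example, passing an archive_name that also carries the time (file names purely illustrative):

hdler = CustomFileHandler('current.log', archive_name='log_%Y%m%d_%H%M%S')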
So I want to write some files that might be locked/blocked for write/delete by other processes, and I would like to test that upfront.
As I understand it, os.access(path, os.W_OK) only looks at the permissions and will return True even though a file cannot currently be written to. So I have this little function:
def write_test(path):
    try:
        fobj = open(path, 'a')
        fobj.close()
        return True
    except IOError:
        return False
It actually works pretty well, when I try it with a file that I manually open with a Program. But as a wannabe-good-developer I want to put it in a test to automatically see if it works as expected.
Thing is: If I just open(path, 'a') the file I can still open() it again no problem! Even from another Python instance. Although Explorer will actually tell me that the file is currently open in Python!
I looked up other posts here & there about locking. Most suggest installing a package. You might understand that I don't want to do that just to test a handful of lines of code. So I dug into the packages to see the actual spot where the locking is eventually done...
fcntl? I don't have that. win32con? Don't have it either... Now in filelock there is this:
self.fd = os.open(self.lockfile, os.O_CREAT|os.O_EXCL|os.O_RDWR)
When I do that on a file it moans that the file exists!! Ehhm ... yea! That's the idea! But even when I do it on a non-existing path, I can still open(path, 'a') it! Even from another Python instance...
I'm beginning to think that I fail to understand something very basic here. Am I looking for the wrong thing? Can someone point me into the right direction?
Thanks!
You are trying to implement the file locking problem using just the system call open(). Unix-like systems use advisory file locking by default. This means that cooperating processes may use locks to coordinate access to a file among themselves, but uncooperative processes are also free to ignore locks and access the file in any way they choose. In other words, file locks lock out other file lockers only, not I/O. See Wikipedia.
As stated in the system call open() reference, the solution for performing atomic file locking using a lockfile is to create a unique file on the same file system (e.g., incorporating hostname and pid), use link(2) to make a link to the lockfile. If link() returns 0, the lock is successful. Otherwise, use stat(2) on the unique file to check if its link count has increased to 2, in which case the lock is also successful.
That is why filelock also uses the function fcntl.flock() and puts all that stuff in a module, as it should be.
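To illustrate the lockfile idea in pure Python, a minimal sketch (all names are made up; like any advisory scheme it only excludes other processes that also check the lock file, and the O_EXCL caveat mentioned earlier applies to NFS, not local disks):

import os

def try_acquire(lock_path):
    """Return a file descriptor if the lock file was created, else None."""
    try:
        # O_CREAT | O_EXCL fails atomically if the lock file already exists.
        return os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_RDWR)
    except FileExistsError:
        return None

def release(lock_path, fd):
    os.close(fd)
    os.remove(lock_path)

fd = try_acquire('somefile.txt.lock')
if fd is not None:
    try:
        pass  # safe to write somefile.txt here
    finally:
        release('somefile.txt.lock', fd)
else:
    print("somefile.txt is locked by another cooperating process")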
Alright! Thanks to those guys I actually have something now! So this is my function:
import os
import msvcrt

def lock_test(path):
    """
    Checks if a file can, aside from its permissions, be changed right now (True)
    or is already locked by another process (False).
    :param str path: file to be checked
    :rtype: bool
    """
    try:
        fd = os.open(path, os.O_APPEND | os.O_EXCL | os.O_RDWR)
    except OSError:
        return False

    try:
        msvcrt.locking(fd, msvcrt.LK_NBLCK, 1)
        msvcrt.locking(fd, msvcrt.LK_UNLCK, 1)
        os.close(fd)
        return True
    except (OSError, IOError):
        os.close(fd)
        return False
And the unittest could look something like this:
import os
import msvcrt
import unittest
# lock_test is assumed to be importable from the module under test

class Test(unittest.TestCase):

    def test_lock_test(self):
        testfile = 'some_test_name4142351345.xyz'
        testcontent = 'some random blaaa'
        with open(testfile, 'w') as fob:
            fob.write(testcontent)
        # test successful locking and unlocking
        self.assertTrue(lock_test(testfile))
        os.remove(testfile)
        self.assertFalse(os.path.exists(testfile))
        # make file again, lock and test False locking
        with open(testfile, 'w') as fob:
            fob.write(testcontent)
        fd = os.open(testfile, os.O_APPEND | os.O_RDWR)
        msvcrt.locking(fd, msvcrt.LK_NBLCK, 1)
        self.assertFalse(lock_test(testfile))
        msvcrt.locking(fd, msvcrt.LK_UNLCK, 1)
        self.assertTrue(lock_test(testfile))
        os.close(fd)
        with open(testfile) as fob:
            content = fob.read()
        self.assertTrue(content == testcontent)
        os.remove(testfile)
It works. Downsides are that it's kind of testing itself with itself, so the initial OSError catch is not even tested, only locking again with msvcrt. But I don't know how to make it better right now.
So, I've been able to use multiprocessing to upload multiple files at once to a given server with the following two functions:
import ftplib
import multiprocessing
import subprocess

def upload(t):
    # These all just return strings representing the various fields I will need.
    server, user, password, service = locker.server, locker.user, locker.password, locker.service
    ftp = ftplib.FTP(server)
    ftp.login(user=user, passwd=password, acct="")
    ftp.storbinary("STOR " + t.split('/')[-1], open(t, "rb"))
    ftp.close()  # Doesn't seem to be necessary, same thing happens whether I close this or not

def ftp_upload(t=files, server=locker.server, user=locker.user, password=locker.password, service=locker.service):
    parsed_targets = parse_it(t)
    ftp = ftplib.FTP(server)
    ftp.login(user=user, passwd=password, acct="")
    remote_files = ftp.nlst(".")
    ftp.close()
    files_already_on_server = [f for f in t if f.split("/")[-1] in remote_files]
    files_to_upload = [f for f in t if f not in files_already_on_server]
    connections_to_make = 3  # The maximum connections allowed by the server is 5, and this error pops up even if I use 1
    pool = multiprocessing.Pool(processes=connections_to_make)
    pool.map(upload, files_to_upload)
My problem is that I (very regularly) end up getting errors such as:
File "/usr/lib/python2.7/multiprocessing/pool.py", line 227, in map
return self.map_async(func, iterable, chunksize).get()
File "/usr/lib/python2.7/multiprocessing/pool.py", line 528, in get
raise self._value
ftplib.error_temp: 421 Too many connections (5) from this IP
Note: There's also a timeout error that occasionally occurs, but I'm waiting for it to rear its ugly head again, at which point I'll post it.
I don't get this error when I use the command line (i.e. "ftp -inv", "open SERVER", "user USERNAME PASSWORD", "mput *.rar"), even when I have (for example) 3 instances of this running at once.
I've read through the ftplib and multiprocessing documentation, and I can't figure out what it is that is causing these errors. This is somewhat of a problem because I'm regularly backing up a large amount of data and a large number of files.
1. Is there some way I can avoid these errors, or is there a different way of having the/a script do this?
2. Is there a way I can tell the script that if it has this error, it should wait for a second and then resume its work?
3. Is there a way I can have the script upload the files in the same order they are in the list (of course speed differences would mean they wouldn't all always be 4 consecutive files, but at the moment the order seems basically random)?
4. Can someone explain why/how more connections are being simultaneously made to this server than the script is calling for?
So, just handling the exceptions seems to be working (except for the occasional recursion error... I still have no idea what is going on there).
As per #3, I wasn't looking for the uploads to be 100% in order, only for the script to pick the next file in the list to upload. Differences in process speeds could/would still cause the order not to be completely sequential, but there would be less variability than in the current system, which seems almost unordered.
You could try to use a single ftp instance per process:
import ftplib
import multiprocessing
import os

def init(*credentials):
    global ftp
    server, user, password, acct = credentials
    ftp = ftplib.FTP(server)
    ftp.login(user=user, passwd=password, acct=acct)

def upload(path):
    with open(path, 'rb') as file:
        try:
            ftp.storbinary("STOR " + os.path.basename(path), file)
        except ftplib.error_temp as error:  # handle temporary error
            return path, error
        else:
            return path, None

def main():
    # ...
    pool = multiprocessing.Pool(processes=connections_to_make,
                                initializer=init, initargs=credentials)
    for path, error in pool.imap_unordered(upload, files_to_upload):
        if error is not None:
            print("failed to upload %s" % (path,))
Specifically answering (2): Is there a way I can tell the script that if it has this error, it should wait for a second, and then resume its work?
Yes.
ftplib.error_temp: 421 Too many connections (5) from this IP
This is an exception. You can catch it and handle it. Python doesn't optimize tail calls, so recursing like this is terrible form, but it can be as simple as this:
import ftplib
from time import sleep

def upload(t):
    # These all just return strings representing the various fields I will need.
    server, user, password, service = locker.server, locker.user, locker.password, locker.service
    try:
        ftp = ftplib.FTP(server)
        ftp.login(user=user, passwd=password, acct="")
        ftp.storbinary("STOR " + t.split('/')[-1], open(t, "rb"))
        ftp.close()  # Doesn't seem to be necessary, same thing happens whether I close this or not
    except ftplib.error_temp:
        ftp.close()
        sleep(2)
        upload(t)
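A non-recursive variant of the same retry idea (just a sketch, reusing the names from the question):

def upload(t, retries=5):
    server, user, password, service = locker.server, locker.user, locker.password, locker.service
    for attempt in range(retries):
        try:
            ftp = ftplib.FTP(server)
            ftp.login(user=user, passwd=password, acct="")
            with open(t, "rb") as fobj:
                ftp.storbinary("STOR " + t.split('/')[-1], fobj)
            ftp.close()
            return
        except ftplib.error_temp:
            sleep(2)  # back off and retry instead of recursing
    raise RuntimeError("giving up on %s after %d attempts" % (t, retries))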
As for your question (3): if that is what you want, do the upload serially, not in parallel.
I look forward to your posting an update with an answer to (4). The only thing which comes to my mind is some other process with an FTP connection to this IP.
I have a Python HTTP server; on a certain GET request a file is created, which is then returned as the response. The file creation might take a second, and so might modifying (updating) the file.
Hence, I cannot immediately return the file as the response. How do I approach such a problem? Currently I have a solution like this:
while not os.path.isfile('myfile'):
    time.sleep(0.1)
return myfile
This seems very inconvenient; is there possibly a better way?
A simple notification would do, but I don't have control over the process which creates/updates the files.
You could use Watchdog for a nicer way to watch the file system?
Something like this will remove the os call:
while updating:
    time.sleep(0.1)
return myfile

...

def updateFile():
    global updating
    # updating file
    updating = False
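If you do go the Watchdog route, a minimal sketch might look like this (assumes `pip install watchdog`; the path and file name are placeholders):

import time
from watchdog.observers import Observer
from watchdog.events import FileSystemEventHandler

class FileReady(FileSystemEventHandler):
    def __init__(self, filename):
        super().__init__()
        self.filename = filename
        self.ready = False

    def on_created(self, event):
        if event.src_path.endswith(self.filename):
            self.ready = True

handler = FileReady('myfile')
observer = Observer()
observer.schedule(handler, path='.', recursive=False)
observer.start()
try:
    while not handler.ready:   # still waiting, but driven by file system events
        time.sleep(0.1)
finally:
    observer.stop()
    observer.join()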
Implementing blocking I/O operations in synchronous HTTP requests is a bad approach. If many people run the same procedure simultaneously, you may soon run out of threads (if there is a limited thread pool). I'd do the following:
A client requests the file creation URI. A file generating procedure is initialized in a background process (some asynchronous task system) and the user gets a file id / name in the HTTP response. Next the client makes AJAX calls every once in a while (polling) to check if the file has been created/modified (a separate file serve/check-if-exists URI). When the file is finally created, the user is redirected (js window.location) to the file serving URI.
This approach will require a bit more work, but eventually it will pay off.
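A rough sketch of that flow (using Flask and a background thread purely for illustration; a real setup would typically use a proper task queue, and all names here are made up):

import os
import threading
import uuid
from flask import Flask, jsonify, send_file

app = Flask(__name__)
RESULT_DIR = "/tmp/results"  # hypothetical output directory
os.makedirs(RESULT_DIR, exist_ok=True)

def generate_file(file_id):
    """Long-running file creation, executed in the background."""
    path = os.path.join(RESULT_DIR, file_id)
    with open(path + ".tmp", "w") as f:
        f.write("...")               # the expensive work goes here
    os.rename(path + ".tmp", path)   # publish only once the file is complete

@app.route("/files", methods=["POST"])
def request_file():
    file_id = uuid.uuid4().hex
    threading.Thread(target=generate_file, args=(file_id,), daemon=True).start()
    return jsonify({"id": file_id}), 202  # client polls with this id

@app.route("/files/<file_id>", methods=["GET"])
def poll_file(file_id):
    path = os.path.join(RESULT_DIR, file_id)
    if not os.path.isfile(path):
        return jsonify({"status": "pending"}), 404
    return send_file(path)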
You can try using os.path.getmtime: check the modification time of the file and return it only if it was modified less than 1 second ago. Also I suggest you only make a limited number of tries, or you will be stuck in an infinite loop if the file doesn't get created/modified. And as @Krzysztof Rosiński pointed out, you should probably think about doing it in a non-blocking way.
import os
from datetime import datetime
import time

for i in range(10):
    try:
        dif = datetime.now() - datetime.fromtimestamp(os.path.getmtime(file_path))
        if dif.total_seconds() < 1:
            return file
    except OSError:
        time.sleep(0.1)
I have a question about shared resource with file handle between processes.
Here is my test code:
from multiprocessing import Process, Lock, freeze_support, Queue
import tempfile
#from cStringIO import StringIO

class File():
    def __init__(self):
        self.temp = tempfile.TemporaryFile()
        #print self.temp

    def read(self):
        print "reading!!!"
        s = "huanghao is a good boy !!"
        print >> self.temp, s
        self.temp.seek(0, 0)
        f_content = self.temp.read()
        print f_content

class MyProcess(Process):
    def __init__(self, queue, *args, **kwargs):
        Process.__init__(self, *args, **kwargs)
        self.queue = queue

    def run(self):
        print "ready to get the file object"
        self.queue.get().read()
        print "file object got"
        file.read()

if __name__ == "__main__":
    freeze_support()
    queue = Queue()
    file = File()
    queue.put(file)
    print "file just put"
    p = MyProcess(queue)
    p.start()
Then I get a KeyError like below:
file just put
ready to get the file object
Process MyProcess-1:
Traceback (most recent call last):
  File "D:\Python26\lib\multiprocessing\process.py", line 231, in _bootstrap
    self.run()
  File "E:\tmp\mpt.py", line 35, in run
    self.queue.get().read()
  File "D:\Python26\lib\multiprocessing\queues.py", line 91, in get
    res = self._recv()
  File "D:\Python26\lib\tempfile.py", line 375, in __getattr__
    file = self.__dict__['file']
KeyError: 'file'
I think when I put the File() object into the queue, the object gets pickled, and the file handle cannot be pickled, so I get the KeyError.
Does anyone have any idea about that? If I want to share objects with a file handle attribute, what should I do?
I have to object (at length, it won't just fit in a comment ;-) to @Mark's repeated assertion that file handles just can't be "passed around between running processes" -- this is simply not true in real, modern operating systems, such as, oh, say, Unix (free BSD variants, MacOSX, and Linux included -- hmmm, I wonder what OS's are left out of this list...?-) -- sendmsg of course can do it (on a "Unix socket", by using the SCM_RIGHTS flag).
Now the poor, valuable multiprocessing is fully right to not exploit this feature (even assuming there might be black magic to implement it on Windows too) -- most developers would no doubt misuse it anyway (having multiple processes access the same open file concurrently and running into race conditions). The only proper way to use it is for a process which has exclusive rights to open certain files to pass the opened file handles to another process which runs with reduced privileges -- and then never use that handle itself again. No way to enforce that in the multiprocessing module, anyway.
Back to @Andy's original question, unless he's going to work on Linux only (AND with local processes only, too) and is willing to play dirty tricks with the /proc filesystem, he's going to have to define his application-level needs more sharply and serialize file objects accordingly. Most files have a path (or can be made to have one: path-less files are pretty rare, actually non-existent on Windows I believe) and thus can be serialized via it -- many others are small enough to serialize by sending their content over -- etc, etc.
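To illustrate the "serialize via the path" option from the last paragraph, a minimal sketch (NamedTemporaryFile is used here so the file has a real path; the content string is borrowed from the question):

import os
import tempfile
from multiprocessing import Process, Queue

def worker(queue):
    # Reopen the file by path inside the child process instead of
    # trying to pickle an open file handle.
    path = queue.get()
    with open(path) as f:
        print(f.read())

if __name__ == "__main__":
    tmp = tempfile.NamedTemporaryFile(mode="w", delete=False)
    tmp.write("huanghao is a good boy !!")
    tmp.close()

    queue = Queue()
    queue.put(tmp.name)          # pass the path, not the open handle
    p = Process(target=worker, args=(queue,))
    p.start()
    p.join()
    os.unlink(tmp.name)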