I'm building a relatively simple application that asks for directories, checks if they're correct, and then removes one of them and recreates it with contents of the other one. I'm encountering this weird behaviour, I'll try to explain:
When I have the destination folder open in an Explorer window AND it's empty, I get an access denied exception, then I get kicked out of the folder and it gets removed. But if it's not empty, it works just fine: no exceptions, and the destination directory (from what I can tell) gets emptied and then filled with the files from the source directory. Which is strange, because it's supposed to outright remove the destination folder no matter what, and then recreate it with the same name and the contents of the source directory.
This doesn't make sense to me: shouldn't I get the exact same exception when I'm browsing the directory while it's not empty as when it's empty? What's the difference? It's still supposed to just delete the folder. Is there any logical explanation for that? Also, if there's an exception, why does the directory get removed anyway?
Code for this particular part is pretty straightforward (please keep in mind I'm a beginner :) )
def Delete(self, dest):
    try:
        shutil.rmtree(dest)
        self.Paste(self.src, dest)
    except (IOError, os.error) as e:
        print e

def Paste(self, src, dest):
    try:
        shutil.copytree(src, dest)
    except (IOError, os.error) as e:
        print e
This is expected behaviour on Windows.
Internally shutil.rmtree calls the Windows API function DeleteFile, which is documented on MSDN (http://msdn.microsoft.com/en-us/library/windows/desktop/aa363915%28v=vs.85%29.aspx).
This function has the following property (emphasis mine):
The DeleteFile function marks a file for deletion on close. Therefore, the file deletion does not occur until the last handle to the file is closed. Subsequent calls to CreateFile to open the file fail with ERROR_ACCESS_DENIED.
If any other process still has a handle open (e.g. virus scanners, windows explorer because you watch the directory or anything else that might still have a handle to that directory), it will not go away.
Usually you just catch the exception in your Paste operation and retry it a few times / over a few dozen milliseconds, to handle all those weird virus scanner anomalies.
Little bonus: you can use WinDbg or Process Explorer to find out who still keeps an open handle to your file (just use Find Handle in Process Explorer and search for the file name).
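The retry idea can be sketched like this (the function name, attempt count and delay are all arbitrary choices for illustration, not part of any API):

```python
import shutil
import time

def rmtree_with_retry(path, attempts=5, delay=0.1):
    """Retry shutil.rmtree a few times to ride out transient handles
    held by virus scanners or Explorer on Windows."""
    for attempt in range(attempts):
        try:
            shutil.rmtree(path)
            return
        except OSError:
            if attempt == attempts - 1:
                raise  # still failing after all retries, give up
            time.sleep(delay)
```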
I'm writing a Python program and now I'm working on exceptions.
while True:
    try:
        os.makedirs("{}\\test".format(dest))
    except PermissionError:
        print("Make sure that you have access to the specified path")
        print("Try again, specify your path: ", end='')
        dest = input()
        continue
    break
It works, but later I need to delete that folder.
What is the best way to do it?
Don't.
It is almost never worth verifying that you have permissions to perform an operation that your program requires. For one thing, permissions are not the only possible reason for failure. A delete may also fail because of a file lock by another program, for instance. Unless you have a very good reason to do otherwise, it is both more efficient and more reliable to just write your code to try the operation and then abort on failure:
import shutil
import sys

try:
    shutil.rmtree(path_to_remove)  # recursively deletes the directory and the files inside it
except Exception:
    print('Failed to delete directory, manual clean up may be required: {}'.format(path_to_remove))
    sys.exit(1)
Other concerns about your code
Use os.path.join to concatenate file paths: os.makedirs(os.path.join(dest, "test")). This will use the appropriate directory separator for the operating system.
Why are you looping on failure? In real world programs, simply aborting the entire operation is simpler and usually makes for a better user experience.
Are you sure you aren't looking for the tempfile library? It allows you to spit out a unique directory to the operating system's standard temporary location:
import os
import tempfile

with tempfile.TemporaryDirectory() as tmpdir:
    some_function_that_creates_several_files(tmpdir)
    for dirpath, dirnames, filenames in os.walk(tmpdir):
        for name in filenames:
            pass  # do something with each file
# tmpdir is automatically deleted when the context manager exits

# Or if you really only need the file
with tempfile.TemporaryFile() as tmpfile:
    tmpfile.write(b'my data')  # TemporaryFile opens in binary mode by default
    some_function_that_needs_a_file(tmpfile)
# tmpfile is automatically deleted when the context manager exits
I think what you want is os.access.
os.access(path, mode, *, dir_fd=None, effective_ids=False, follow_symlinks=True)
Use the real uid/gid to test for access to path. Note that most operations will use the effective uid/gid, therefore this routine can be used in a suid/sgid environment to test if the invoking user has the specified access to path. mode should be F_OK to test the existence of path, or it can be the inclusive OR of one or more of R_OK, W_OK, and X_OK to test permissions. Return True if access is allowed, False if not.
For example:
os.access("/path", os.R_OK)
And mode can be one of:
os.F_OK # existence
os.R_OK # readability
os.W_OK # writability
os.X_OK # executability
Refer: https://docs.python.org/3.7/library/os.html#os.access
I'm trying to write to a file like this:
with open(filename, 'wb') as f:
    f.write(data)
But when opening a file and writing to it many things might go wrong and I want to properly report what went wrong to the user. So I don't want to do:
try:
    with open(filename, 'wb') as f:
        f.write(data)
except:
    print("something went wrong")
But instead I want to tell the user more specifics so that they can fix the problem on their end. So I started by catching PermissionError to tell the user to fix their permissions. But soon I got a bug report that the program would fail if filename pointed to a directory, so I also caught and reported IsADirectoryError. But this is not all: it might be that the path to filename doesn't exist, so later I also added FileNotFoundError.
Opening a file and writing to it is a very simple pattern but still many things can go wrong. I'm wondering whether there is an easy recipe that allows me to catch these errors and report them properly without giving a generic "some error happened" message. This kind of pattern has been used for decades, so there must be some list of things to check for, no?
I now caught a number of exceptions but I can imagine that even more are lurking around. Things probably become even worse when running this on non-Linux platforms...
What are the best practices for opening a file and writing to it in Python that give a maximum of information back to the user when things go wrong, without resorting to dumping the bare traceback?
On the operating system level there are only so many things that can go wrong when doing an open() and write() system call so there must exist a complete list?
You could catch the higher level exception OSError and then present the value of the strerror field to the user as the explanation. This relies on the fact that exceptions such as FileNotFoundError are subclasses of OSError and these subclasses can all be caught by using the base class.
Here's an example:
try:
    filename = 'missing_file'
    with open(filename) as f:
        for line in f:
            # whatever....
            pass
except OSError as exc:
    print('{!r}: {}'.format(filename, exc.strerror))
Running this code when missing_file does not exist produces this error message:
'missing_file': No such file or directory
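The same pattern covers the write case from the question; here filename and data are placeholder values picked so the except branch fires:

```python
filename = '/nonexistent_dir/out.bin'  # hypothetical path whose parent directory is missing
data = b'payload'
try:
    with open(filename, 'wb') as f:
        f.write(data)
except OSError as exc:
    # exc.strerror carries the human-readable reason: permission denied,
    # is a directory, no such file or directory, and so on
    print('Cannot write {!r}: {}'.format(filename, exc.strerror))
```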
Under Windows, suppose you have the following directory:
r:\glew-1.10.0
What Python method can be used to reliably identify that this is a directory?
os.path.isdir("r:\\glew-1.10.0") # returns false
os.path.isfile("r:\\glew-1.10.0") # returns true
mode = os.stat("r:\\glew-1.10.0").st_mode
stat.S_ISDIR(mode) # raises a WindowsError - the system cannot find the file
It seems clear that the presence of a period in the directory name is causing the problem with these functions, but I can't find any alternatives.
Answering my own ineptness...
First, the concrete:
isdir(path)
will simply return False if the path doesn't exist (it won't raise an error).
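A quick demonstration, assuming the path does not exist on the machine running it:

```python
import os

# isdir and isfile just return False for a missing path; neither raises,
# unlike os.stat, which raises an error for a path that does not exist.
missing = "r:\\glew-1.10.0"  # a path assumed not to exist here
print(os.path.isdir(missing))   # False
print(os.path.isfile(missing))  # False
```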
Then the abstract:
If a search on Stack Exchange or Google doesn't reveal anyone else who mentions any problem similar to the one you have encountered, double-check your code. Unless you are doing something obscure, the chances are good that you just made an error yourself.
This may be an open-ended or awkward question, but I find myself running into more and more exception handling concerns where I do not know the "best" approach for handling them.
Python's logging module raises an IOError if you try to configure a FileHandler with a file that does not exist. The module does not handle this exception, but simply raises it. Often times, it is that the path to the file does not exist (and therefore the file does not exist), so we must create the directories along the path if we want to handle the exception and continue.
I want my application to properly handle this error, as every user has asked why we don't make the proper directory for them.
The way I have decided to handle this can be seen below.
done = False
while not done:
    try:
        # Configure logging based on a config file;
        # if a FileHandler's full path to the file does not exist, it raises an IOError
        logging.config.fileConfig(filename)
    except IOError as e:
        if e.args[0] == 2 and e.filename:
            # If we catch the IOError, we can see if it is a "does not exist" error
            # and try to recover by making the directories
            print "Most likely the full path to the file does not exist, so we can try and make it"
            fp = os.path.dirname(e.filename)
            # See http://stackoverflow.com/questions/273192/python-best-way-to-create-directory-if-it-doesnt-exist-for-file-write#273208 for why I don't just leap
            if not os.path.exists(fp):
                os.makedirs(fp)
        else:
            print "Most likely some other error...let's just reraise for now"
            raise
    else:
        done = True
I need to loop (or recurse, I suppose) since there are N FileHandlers that need to be configured, and therefore N IOErrors that need to be raised and corrected for this scenario.
Is this the proper way to do this? Is there a better, more Pythonic way, that I don't know of or may not understand?
This is not something specific to the logging module: in general, Python code does not create intermediate directories for you automatically; you need to do this explicitly using os.makedirs(), typically like this:
if not os.path.exists(dirname):
    os.makedirs(dirname)
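As a side note, on Python 3.2 and later the check-then-create pair can be collapsed into one call (log_dir is a placeholder path):

```python
import os

log_dir = os.path.join('logs', 'app')  # hypothetical log directory
os.makedirs(log_dir, exist_ok=True)    # creates intermediate dirs; no error if present
os.makedirs(log_dir, exist_ok=True)    # calling it again is fine too
```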
You can replace the standard FileHandler provided by logging with a subclass which does the checks you need and, when necessary, creates the directory for the logging file using os.makedirs(). Then you can specify this handler in the config file instead of the standard handler.
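A minimal sketch of that subclass idea; the class name MakedirsFileHandler is invented here, while the constructor parameters mirror the standard logging.FileHandler signature:

```python
import logging
import os

class MakedirsFileHandler(logging.FileHandler):
    """FileHandler that creates missing parent directories before opening the log file."""

    def __init__(self, filename, mode='a', encoding=None, delay=False):
        dirname = os.path.dirname(os.path.abspath(filename))
        if dirname and not os.path.exists(dirname):
            os.makedirs(dirname)
        logging.FileHandler.__init__(self, filename, mode, encoding, delay)
```

In the config file you would then reference this class by its full dotted path in place of logging.FileHandler.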
Assuming it only needs to be done once at the beginning of your app's execution, I would just os.makedirs() all the needed directories without checking for their existence first, or even waiting for the logging module to raise an error. If you then get an error trying to start a logger, you can just handle it the way you likely already do: print an error, disable the logger. You went above and beyond just by trying to create the directory. If the user gave you bogus information, you're no worse off than you are now, and you're better off in the vast majority of cases.
I am rather new to software testing. I wonder how to write a unit test for file related functions in python. e.g., if I have a file copy function as follows.
def copy_file(self):
    if not os.path.isdir(dest_path):
        os.makedirs(dest_path)
    try:
        shutil.copy2(src_path, dest_path)
    except IOError as e:
        print e
What should I do about testing the above function? What should I assert (directory, file content, exceptions)?
You can rely on shutil.copy2 copying the contents correctly. You only need to test your code. In this case that it creates the directories if they don't exist, and that it swallows IOErrors.
And don't forget to clean up. ;)
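One way to test along those lines, using the standard unittest and tempfile modules; the Copier class here is a self-contained stand-in for the asker's class, and its attribute names (src_path, dest_path) are guesses about the surrounding code:

```python
import os
import shutil
import tempfile
import unittest

class Copier(object):
    """Stand-in for the original class so the test is self-contained."""
    def __init__(self, src_path, dest_path):
        self.src_path = src_path
        self.dest_path = dest_path

    def copy_file(self):
        if not os.path.isdir(self.dest_path):
            os.makedirs(self.dest_path)
        try:
            shutil.copy2(self.src_path, self.dest_path)
        except IOError as e:
            print(e)

class CopyFileTest(unittest.TestCase):
    def setUp(self):
        self.tmp = tempfile.mkdtemp()
        self.src = os.path.join(self.tmp, 'src.txt')
        with open(self.src, 'w') as f:
            f.write('hello')
        self.dest = os.path.join(self.tmp, 'dest_dir')

    def tearDown(self):
        shutil.rmtree(self.tmp)  # clean up, as the answer says

    def test_creates_missing_dest_dir_and_copies(self):
        Copier(self.src, self.dest).copy_file()
        self.assertTrue(os.path.isdir(self.dest))
        with open(os.path.join(self.dest, 'src.txt')) as f:
            self.assertEqual(f.read(), 'hello')

    def test_ioerror_is_swallowed(self):
        copier = Copier(os.path.join(self.tmp, 'missing.txt'), self.dest)
        copier.copy_file()  # must not raise
```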
I think you can get some clues from test_shutil itself and see how it tests the copy functionality: namely, it copies files and checks that they exist using another module. The difference in behaviour between the standard shutil.copy2 and your wrapper is in dealing with a destination that does not already exist. With shutil.copy2, if the destination does not exist, a file is created that is a copy of the source; in your case it is not a file, but a destination directory that is created, and the source is copied into it. So write tests where the destination does not exist, and ensure that after your wrapper runs, the destination is a directory and contains the file that shutil copied.
Think about requirements you put on your method, like:
The method shall report an error if source_dir does not exist or either the source or destination directory is not accessible
The method shall create destination_dir if it does not exist, and report an error if it cannot be created
The method shall report an error if either the source or destination directory has an illegal value (null, or illegal characters for a directory)
The method shall copy all files from the source to the destination directory
check if all files have been copied
The method shall replace existing files (maybe)
check if existing files are replaced
Those are just some thoughts. Don't add every possible requirement and don't test too much; keep in mind what you want to do with this method. If it's private or only used in your own code, you can reduce the scope, but if you provide a public API, then you have to make sure that any input has a defined result (which may be an error message).