This simple code
# This code will BLANK the file 'myfile'!
with open('myfile', 'w') as file:
    raise Exception()
rather than merely throwing an exception, deletes all data in "myfile", although no actual write operation is even attempted.
This is dangerous to say the least, and certainly not how other languages treat such situations.
How can I prevent this? Do I have to handle every possible exception in order to be certain that the target file will not be blanked by some unforeseen condition? Surely there must be a standard pattern to solve this problem. And, above all: what is happening here in the first place?
You are opening a file for writing. It is that simple action that blanks the file, regardless of what else you do with it. From the open() function documentation:
'w'
open for writing, truncating the file first
Emphasis mine. In essence, the file is truncated the moment it is opened; it stays empty because nothing was written to it afterwards.
Postpone opening the file to a point where you actually have data to write if you don't want this to happen. Writing a list of strings to a file is not going to cause exceptions at the Python level.
Alternatively, write to a new file, and rename (move) it afterwards to replace the original. Renaming a file is left to the OS.
The statement open('myfile', 'w') will delete all the contents on execution i.e. truncate the file.
If you want to retain the lines you have to use open('myfile', 'a'). Here the a option is for append.
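A minimal sketch of the difference between the two modes, using a throwaway file in a temporary directory (the helper name, file name and contents are made up for illustration):

```python
import os
import tempfile

def size_after_open(mode):
    """Create a 5-byte file, reopen it in `mode` without writing, return its size."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "myfile")  # hypothetical file name
        with open(path, "w") as f:
            f.write("hello")
        # Open and close again without writing anything.
        with open(path, mode):
            pass
        return os.path.getsize(path)

print(size_after_open("w"))  # 0: 'w' truncates as soon as the file is opened
print(size_after_open("a"))  # 5: 'a' leaves the existing content alone
```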
Opening a file for writing erases its contents. The best way to avoid loss of data, not only in case of exceptions but also on computer shutdown and the like, is to write to a new temporary file and rename it to the original name once everything is done.
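import os
import tempfile

yourfile = "myfile"
try:
    # Create the temporary file next to the target so the rename stays on one filesystem.
    with tempfile.NamedTemporaryFile(dir=os.path.dirname(yourfile) or '.', delete=False) as output:
        do_something()
except Exception:
    handle_exception()
else:
    # Note: on Windows, os.rename() fails if the target exists; os.replace() overwrites it.
    os.rename(output.name, yourfile)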
I have a Python script that takes two arguments, the names of the input and output files, i.e. it starts off like
inputFile=open(sys.argv[1],'r')
outFile=open(sys.argv[2],'w')
Then performs whatever operation reading from inputFile and writing to the outFile.
Now, a few times, through human error, I've accidentally given the same argument twice, the result being that my input file is replaced with a blank file. Is there a straightforward way to stop this happening?
I thought it might be as simple as adding
if sys.argv[1]==sys.argv[2]:
    inputFile.close()
    outFile.close()
immediately after the first lines above, but this already leaves the file blank.
Simply do:
import os
import sys
if os.path.realpath(sys.argv[1]) != os.path.realpath(sys.argv[2]):
    inputFile = open(sys.argv[1], 'r')
    outFile = open(sys.argv[2], 'w')
else:
    raise ValueError('Input and output files are the same')
This guards against human mistakes by raising an error before the input file can be destroyed.
os.path.realpath turns any relative path into an absolute, canonical path, so the error is raised whenever the resolved paths are identical, even if the strings differ (thanks to Jean-François Fabre for reminding me of this).
Opening the file for writing immediately truncates it, so the damage is already done by the time you compare the strings.
That said:
On Windows filesystems, the protection is "built-in": if the file is open in read mode, it cannot be opened in write mode at the same time. Good (there's a "grey area" for networked filesystems, though).
On Linux/Unix, the risk is there. But comparing the names isn't enough. What if two different paths point to the same file after all? (Consider foo/bar and /mydrive/foo/bar, or foo/../bar and bar.)
You could, for instance, call os.path.realpath() on both paths before comparing them, to resolve relative paths that differ only in spelling (on POSIX systems it also resolves symbolic links, but it cannot detect hard links; still, it's better than nothing).
And for the Windows "grey area" I mentioned, comparing the lowercased names would be a good idea.
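As a sketch of a stricter check than comparing path strings: the standard library also offers os.path.samefile(), which is not mentioned above and compares the underlying files rather than their names. The helper name is_same_file is made up for illustration.

```python
import os

def is_same_file(src, dst):
    """Return True if dst already exists and refers to the same file as src."""
    # os.path.samefile() raises if either path is missing, so an output
    # file that has not been created yet is treated as "different".
    return os.path.exists(dst) and os.path.samefile(src, dst)
```

The idea would be to call this before opening the output file for writing and to raise an error when it returns True. It still assumes the input file exists; if it does not, os.path.samefile() raises an OSError.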
The input file ends up blank because open(filename, 'w') truncates an existing file before anything is written to it. 'w' is useful for creating a file and then writing to it. I'd suggest trying open(filename, 'a') to append to a pre-existing file. Append mode also creates the file if it does not exist yet, but it sounds like you have 2 existing files already, so append should be what you need.
If you decide to go with the if sys.argv[1] == sys.argv[2] approach, you could wrap each item in str() to be certain they are compared as strings, although sys.argv entries are already strings.
I have a situation where I have a file open using 'with'. I make some edits to the file and save it if the changes are successful. However, whenever an error occurs during file handling, I want the file to be closed without any changes made to it. The with statement seems to overwrite the file and leave it empty.
Here is the code:
with open(path + "\\Config\\"+ filename, 'wb') as configfile:
    config.write(configfile)
I get the "a bytes-like object is required, not 'str'" error for the above code which is fine. But all the content from the file has been removed when the error occurs.
How can I explicitly tell the code not to save the changes and to revert to the content that existed before the change was made?
I use ActivePython 3.5.
If you don't want to make any changes to the original file unless everything is successful, what you should do is write your output to a new file. Then when you're done, rename that file to the original file.
If an error happens, you can use try/except to catch the error and delete the temporary file before exiting.
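A minimal sketch of that write-then-rename pattern, wrapped in a context manager. The name atomic_write is made up for illustration, and os.replace() (Python 3.3+) is my choice for the final rename because it also overwrites an existing target on Windows.

```python
import os
import tempfile
from contextlib import contextmanager

@contextmanager
def atomic_write(path, mode="w"):
    """Yield a temporary file; replace `path` with it only if the block succeeds."""
    directory = os.path.dirname(path) or "."
    # Same directory as the target, so the final rename stays on one filesystem.
    tmp = tempfile.NamedTemporaryFile(mode, dir=directory, delete=False)
    try:
        with tmp:           # closes the temporary file when the block ends
            yield tmp
    except BaseException:
        os.remove(tmp.name)         # error: discard the partial output
        raise
    else:
        os.replace(tmp.name, path)  # success: swap in the new file
```

With this helper, a failing block such as `with atomic_write(name) as f: config.write(f)` would leave the original file untouched and re-raise the error.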
Open in a different mode than 'w'. Using 'w' will create the file if it does not exist; otherwise it truncates whatever is in the file already. Use 'a' instead, which does not truncate by default. However, note that the file cursor will be at the end of the file. If you actually want to overwrite when there is no error, you'll have to call f.seek(0) and then f.truncate() manually.
EDIT
Actually, it might be better to use 'r+', which does not truncate automatically either and positions the stream at the beginning of the file instead of the end (as 'a' does), so only a simple f.truncate() is necessary. See the open() documentation for the available modes. Basically, you definitely don't want 'w', but rather 'r+' or 'a', depending on precisely the behavior you want.
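A rough sketch of the 'r+' approach, assuming the new content can be computed before anything is written. The function name overwrite_in_place and the make_content callback are hypothetical.

```python
def overwrite_in_place(path, make_content):
    """Replace the contents of an existing file, leaving it intact if make_content fails."""
    # 'r+' does not truncate, and raises an error if the file is missing.
    with open(path, "r+") as f:
        new_content = make_content(f.read())  # may raise; the file is still untouched
        f.seek(0)        # read() moved the position to the end of the file
        f.truncate()     # only now discard the old content
        f.write(new_content)
```

This only protects against errors raised before the truncate() call. A failure during the write itself can still leave a partial file, which is what the temporary-file approach avoids.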
with open(file, 'rb') as readerfile:
    reader = csv.reader(readerfile)
In the above syntax, can I perform the first and second line together? It seems unnecessary to use 2 variables ('readerfile' and 'reader' above) if I only need to use the latter.
Is the former variable ('readerfile') ever used?
Can I use the same variable name for both, or is that bad form?
You can do:
reader = csv.reader(open(file, 'rb'))
but that would mean you are not closing your file explicitly.
with open(file, 'rb') as readerfile:
The first line opens the file and stores the file object in readerfile. The with statement ensures that the file is closed when you exit the block by any means, including exceptions.
reader = csv.reader(readerfile)
The second line creates a CSV reader object using the file object. It needs the file object (otherwise where would it read the data from?). Of course you could conceivably store it in the same variable
readerfile = csv.reader(readerfile)
if you wanted to (and don't plan on using the file object again), but this will likely lead to confusion for readers of your code.
Note that you haven't read anything yet! You still need to iterate over the reader object in order to get the data that you're interested in, and if you close the file before that happens then the reader object won't work. The file object is used behind the scenes by the reader object, even if you "hide" it by overwriting the readerfile variable.
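To illustrate the point about reading before the file is closed, here is a small sketch that consumes the reader inside the with block. It uses the Python 3 form of opening CSV files (text mode with newline='') rather than the 'rb' mode shown above, and the function name read_rows is made up.

```python
import csv

def read_rows(path):
    """Read every CSV row while the file is still open."""
    with open(path, newline="") as readerfile:
        # list() forces the lazy reader to consume the whole file now.
        return list(csv.reader(readerfile))
```

Returning the reader object itself instead of the list would hand the caller a reader whose file is already closed.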
Lastly, if you really want to do everything on one line, you could conceivably define a function that abstracts the with statement:
def with1(context, func):
    with context as x:
        return func(x)
Now you can write this as one line:
data = with1(open(file, 'rb'), lambda readerfile: list(csv.reader(readerfile)))
It's by no means clearer, however.
This is not recommended at all.
Why is it important to use one line?
Most Python programmers know well the benefits of using the with statement. Keep in mind that readers might be lazy (that is, they read line by line) in some cases. You want to handle the file with the correct statement, ensuring that it is closed properly even if errors arise.
Nevertheless, you can use a one liner for this, as stated in other answers:
reader = csv.reader(open(file, 'rb'))
So basically you want a one-liner?
reader = csv.reader(open(file, 'rb'))
As said before, the problem is that you lose what with open() gives you, namely the following steps in one go:
Open the file
Do what you want with the file (inside your open block)
Close the file (that is implicit and you don't have to specify it)
If you don't use with open but call open directly, your file stays open until the object is garbage collected, and that could lead to unpredictable behaviour in some cases.
Plus, your original code (two lines) is much more readable than a one-liner.
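The three steps listed above correspond roughly to a try/finally block. This is a simplified sketch of what the with statement does for a file, not its exact expansion, and the function name read_all is made up.

```python
def read_all(path):
    """Roughly what `with open(path) as f: return f.read()` does."""
    f = open(path)          # 1. open the file
    try:
        return f.read()     # 2. do what you want with the file
    finally:
        f.close()           # 3. close it, even if step 2 raised
```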
If you put them together, then the file won't be closed automatically -- but that often doesn't really matter, since it will be closed automatically when the script terminates.
It's not common to need to reference the raw file once a csv.reader instance has been created from it (except possibly to explicitly close it if you're not using a with statement).
If you use the same variable name for both, it will probably work because the csv.reader instance will still hold a reference to the file object, so it won't be garbage collected until the program ends. It's not a common idiom, however.
Since csv files are often processed sequentially, the following can be a fairly concise way to do it, since the csv.reader instance frequently doesn't need a variable name, and the file is closed properly even if an exception occurs:
with open(file, 'rb') as readerfile:
    for row in csv.reader(readerfile):
        pass  # process the data...
I'm learning PyGTK and I'm making a Text Editor (That seems to be the hello world of pygtk :])
Anyways, I have a "Save" function that writes the TextBuffer to a file. Looks something like
try:
    f = open(self.working_file_path, "rw+")
    buff = self._get_buffer()
    f.write(self._get_text())
    #update modified flag
    buff.set_modified(False)
    f.close()
except IOError as e:
    print "File Doesnt Exist so bring up Save As..."
    ......
Basically, if the file exists, write the buffer to it; if not, bring up the Save As dialog.
My question is: What is the best way to "update" a file. I seem to only be able to append to the end of a file. I've tried various file modes, but I'm sure I'm missing something.
Thanks in advance!
You can open a file in "r+" mode, which allows you to both read and write to the file, and to seek to particular positions and write there. This probably doesn't help you do what I think you want though; it sounds like you're wanting to only write out the changed data?
Remember that on the disk the file isn't stored as a series of extensible lines, it's just a sequence of bytes; some of those bytes indicate line-endings, but the next line follows on immediately. So if you edit the first line in the file and you write the new first line out, unless the new one happens to be exactly the same length as the old one the second line now won't be in the right place, so you'll need to move it (and have taken a copy of it first if the new line you wrote out was longer than the original). And this now means that the next line isn't in the right position either... and so on until you've had to read in and write out the entire rest of the file.
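A small demonstration of that effect, using a throwaway two-line file (the helper name and file name are made up for illustration). Writing in 'r+' mode overwrites bytes in place and shifts nothing:

```python
import os
import tempfile

def overwrite_first_line(new_first_line):
    """Write new_first_line over the start of a two-line file and return the result."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.txt")  # hypothetical file
        # newline="" keeps line endings untranslated on every platform.
        with open(path, "w", newline="") as f:
            f.write("one\ntwo\n")
        with open(path, "r+", newline="") as f:
            f.write(new_first_line)   # overwrites in place, nothing is moved
        with open(path, newline="") as f:
            return f.read()

print(repr(overwrite_first_line("ONE\n")))    # same length: the second line survives
print(repr(overwrite_first_line("first\n")))  # longer: it spills into the second line
```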
In practice you almost never write only part of an existing file unless you can simply append more data; if you need to "alter" a file you read it in, alter it in memory, and write it back out or you read in the file in pieces (often line by line) and then write out to a new file as you go (and then possibly move the new file over the top of the original). The first approach is easiest, the second is better for not having to hold the whole thing in memory at once.
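The second approach (read in pieces, write to a new file, then move it over the original) could be sketched like this. The function name rewrite_lines and the transform callback are hypothetical, and cleanup of the temporary file on error is left out for brevity.

```python
import os
import tempfile

def rewrite_lines(path, transform):
    """Stream `path` through transform(line) into a new file, then swap it in."""
    directory = os.path.dirname(path) or "."
    with open(path) as src, tempfile.NamedTemporaryFile(
            "w", dir=directory, delete=False) as dst:
        for line in src:               # only one line is held in memory at a time
            dst.write(transform(line))
    os.replace(dst.name, path)         # move the new file over the original
```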
At the point where you write to the file, your location is at the end of the file, so you need to seek back to the beginning. Then, you will overwrite the file, but this may leave old content at the end, so you also need to truncate the file.
Additionally, the mode you're specifying ('rw+') is invalid, and I get IOErrors when I try to do some operations on files opened with it. I believe that you want mode 'r+' ("Open for reading and writing. The stream is positioned at the beginning of the file."). 'w+' also opens for reading and writing and would create the file if it didn't exist, but it truncates an existing file on open.
So, what you're looking for might be code like this:
try:
    f = open(self.working_file_path, "r+")
    buff = self._get_buffer()
    f.seek(0)
    f.truncate()
    f.write(self._get_text())
    #update modified flag
    buff.set_modified(False)
    f.close()
except IOError as e:
    print "File Doesnt Exist so bring up Save As..."
    ......
However, you may want to modify this code to correctly catch and handle errors while truncating and writing the file, rather than assuming that all IOErrors in this section are nonexistent-file errors from the call to open.
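One way to separate the two cases, sketched in Python 3 (the code above is Python 2, and FileNotFoundError only exists from Python 3.3 on). The function name save_existing is made up, and the GTK buffer handling is left out.

```python
def save_existing(path, text):
    """Overwrite an existing file; return False if it does not exist."""
    try:
        f = open(path, "r+")
    except FileNotFoundError:
        return False          # caller can bring up "Save As..."
    # Errors from here on are real I/O failures and propagate to the caller.
    with f:
        f.truncate()          # the position is 0 right after opening in 'r+'
        f.write(text)
    return True
```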
Read the file in as a list, add an element to the start of it, write it all out. Something like this.
f = open(self.working_file_path, "r+")
flist = f.readlines()
flist.insert(0, self._get_text())
f.seek(0)
f.writelines(flist)
f.close()
I am just beginning with Python using LPTHW and have a specific question about closing a file.
I can open a file with:
input = open(from_file)
indata = input.read()
#Do something
input.close()
However, if I try to simplify the code into a single line:
indata = open(from_file).read()
How do I close the file I opened, or is it already automatically closed?
Thanks in advance for the help!
You simply have to use more than one line; however, a more pythonic way to do it would be:
with open(path_to_file, 'r') as f:
    contents = f.read()
Note that with what you were doing before, you could miss closing the file if an exception was thrown. The 'with' statement here will cause it to be closed even if an exception is propagated out of the 'with' block.
Files are automatically closed when the file object is no longer referenced. In CPython this is handled by reference counting as soon as the last reference goes away; other Python implementations may close the file only later, when the garbage collector runs.
In this case, the call to open() creates a file object, whose read() method is then run. After the method has executed, no reference to the object exists and it is closed (at the latest by the end of script execution).
Although this works, it is not good practice. It is always better to explicitly close a file, or (even better) to follow the with suggestion of the other answer.
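A quick way to convince yourself that the with statement closes the file even when the block fails is to check the file object's closed attribute afterwards. The function name and the simulated error are made up for the demonstration.

```python
def closed_after_error(path):
    """Return whether the file is closed after the with block raises."""
    try:
        with open(path) as f:
            raise RuntimeError("simulated failure")  # made-up error for the demo
    except RuntimeError:
        # f is still bound here, but the with statement has already closed it.
        return f.closed
```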