I have a text file whose contents I want to erase in Python. How do I do that?
In Python:
open('file.txt', 'w').close()
Alternatively, if you already have an open file:
f = open('file.txt', 'r+')
f.truncate(0) # need '0' when using r+
Opening a file in "write" mode clears it; you don't specifically have to write to it:
open("filename", "w").close()
(You should close it explicitly, as the timing of when the file gets closed automatically may be implementation-specific.)
Not a complete answer; more of an extension to ondra's answer.
When using truncate() (my preferred method), make sure your cursor is at the required position.
When a file is opened for reading with open('FILE_NAME', 'r'), its cursor is at 0 by default.
But if you have parsed the file within your code, make sure to point at the beginning of the file again, e.g. by calling seek(0) first or by passing the size explicitly with truncate(0).
By default, truncate() truncates the contents of a file starting from the current cursor position.
A simple example
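Here is a minimal sketch of that behaviour, assuming Python 3 and a throwaway file created with tempfile (the file name and its contents are made up for illustration). It shows truncate() cutting at the current cursor position, and truncate(0) emptying the file wherever the cursor happens to be.

```python
import os
import tempfile

# Hypothetical scratch file, used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")
with open(path, "w") as f:
    f.write("0123456789")

with open(path, "r+") as f:
    f.seek(4)        # move the cursor to offset 4
    f.truncate()     # no argument: cut at the current cursor position
with open(path) as f:
    after_partial = f.read()   # only the first four characters survive

with open(path, "r+") as f:
    f.read()         # parsing the file moves the cursor to the end
    f.truncate(0)    # explicit size 0: empty the file regardless of the cursor
with open(path) as f:
    after_full = f.read()      # the file is now empty
```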
As @jamylak suggested, a good alternative that includes the benefits of context managers is:
with open('filename.txt', 'w'):
    pass
When using with open("myfile.txt", "r+") as my_file:, I get strange zeros in myfile.txt, especially since I am reading the file first. For it to work, I had to first change the pointer of my_file to the beginning of the file with my_file.seek(0). Then I could do my_file.truncate() to clear the file.
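A small sketch of that sequence, assuming Python 3. The helper name clear_after_reading and the scratch path are made up for this example. As far as I can tell, the zeros appear because truncate(0) shrinks the file but does not move the pointer, so a later write lands at the old offset and the gap is typically filled with null bytes. Rewinding with seek(0) before truncating avoids that.

```python
import os
import tempfile

def clear_after_reading(path, new_text=""):
    """Read a file, then empty it and write new_text starting at offset 0."""
    with open(path, "r+") as my_file:
        old_text = my_file.read()   # the pointer is now at the end of the file
        my_file.seek(0)             # rewind first ...
        my_file.truncate()          # ... so the cut happens at offset 0
        my_file.write(new_text)     # written at the start, no padding bytes
    return old_text

# Usage on a throwaway file (the path is hypothetical).
demo_path = os.path.join(tempfile.mkdtemp(), "myfile.txt")
with open(demo_path, "w") as f:
    f.write("old contents\n")
old = clear_after_reading(demo_path, "fresh")
```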
Writing and Reading file content
def writeTempFile(text=None):
    filePath = "/temp/file1.txt"
    if not text:  # If no text is provided, return the file content
        with open(filePath, "r") as f:  # the with block closes the file
            return f.read()
    else:
        f = open(filePath, "a")  # Open for appending; creates the file if it is missing
        f.seek(0)  # Set the pointer at the beginning of the file
        f.truncate()  # Clear previous content
        f.write(text)  # Write the new content
        f.close()  # Close file
        return text
It worked for me.
If security is important to you then opening the file for writing and closing it again will not be enough. At least some of the information will still be on the storage device and could be found, for example, by using a disc recovery utility.
Suppose, for example, the file you're erasing contains production passwords and needs to be deleted immediately after the present operation is complete.
Zero-filling the file once you've finished using it helps ensure the sensitive information is destroyed.
On a recent project we used the following code, which works well for small text files. It overwrites the existing contents with lines of zeros.
import os
def destroy_password_file(password_filename):
    with open(password_filename) as password_file:
        text = password_file.read()
    lentext = len(text)
    zero_fill_line_length = 40
    zero_fill = ['0' * zero_fill_line_length
                 for _
                 in range(lentext // zero_fill_line_length + 1)]
    zero_fill = os.linesep.join(zero_fill)
    with open(password_filename, 'w') as password_file:
        password_file.write(zero_fill)
Note that zero-filling will not guarantee your security. If you're really concerned, you'd be best to zero-fill and use a specialist utility like File Shredder or CCleaner to wipe clean the 'empty' space on your drive.
You have to overwrite the file. In C++:
#include <fstream>
std::ofstream("test.txt", std::ios::out).close();
You can also use this (based on a few of the above answers):
file = open('filename.txt', 'w')
file.close()
Of course, this is a clumsy way to clear a file because it takes more lines of code than necessary, but I just wrote this to show you that it can be done this way too.
Happy coding!
You cannot "erase" from a file in-place unless you need to erase the end. Either be content with an overwrite of an "empty" value, or read the parts of the file you care about and write it to another file.
Assigning the file pointer to null inside your program will just get rid of that reference to the file. The file's still there. I think the remove() function in C's stdio.h is what you're looking for there. Not sure about Python.
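For what it's worth, Python's standard library has os.remove() for deleting a file from disk. A minimal sketch, assuming Python 3; the helper name delete_file and the scratch path are made up for this example:

```python
import os
import tempfile

def delete_file(path):
    """Remove the file from disk; return False if it was already gone."""
    try:
        os.remove(path)
        return True
    except FileNotFoundError:
        return False

# Usage on a throwaway file (the path is hypothetical).
demo_path = os.path.join(tempfile.mkdtemp(), "test.txt")
open(demo_path, "w").close()        # create an empty file
first = delete_file(demo_path)      # the file exists, so it is removed
second = delete_file(demo_path)     # nothing left to remove
```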
Since text files are sequential, you can't directly erase data on them. Your options are:
The most common way is to create a new file. Read from the original file and write everything on the new file, except the part you want to erase. When all the file has been written, delete the old file and rename the new file so it has the original name.
You can also truncate and rewrite the entire file from the point you want to change onwards. Seek to point you want to change, and read the rest of file to memory. Seek back to the same point, truncate the file, and write back the contents without the part you want to erase.
Another simple option is to overwrite the data with another data of same length. For that, seek to the exact position and write the new data. The limitation is that it must have exact same length.
Look at the seek/truncate function/method to implement any of the ideas above. Both Python and C have those functions.
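As a rough sketch of the second option (truncate and rewrite from the point of change), assuming Python 3 and byte offsets rather than character positions. The helper name delete_span is made up for this example, and the file is opened in binary mode so that offsets are plain byte counts.

```python
import os
import tempfile

def delete_span(path, start, length):
    """Remove `length` bytes beginning at byte offset `start`, in place."""
    with open(path, "r+b") as f:
        f.seek(start + length)
        tail = f.read()      # everything after the span to delete
        f.seek(start)
        f.write(tail)        # slide the tail back over the deleted span
        f.truncate()         # drop the leftover bytes at the end

# Usage on a throwaway file (the path is hypothetical).
demo_path = os.path.join(tempfile.mkdtemp(), "data.txt")
with open(demo_path, "wb") as f:
    f.write(b"hello cruel world")
delete_span(demo_path, 5, 6)         # removes the 6 bytes " cruel"
with open(demo_path, "rb") as f:
    remaining = f.read()
```

Note that the tail is held in memory, so for very large files the first option (writing a new file) is probably the safer choice.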
This is my method:
open the file using r+ mode
read the current data from the file using file.read()
move the pointer to the start of the file using file.seek(0)
remove the old data from the file using file.truncate(0)
write the new content, followed by the old content that we saved from file.read()
So the full code will look like this:
with open(file_name, 'r+') as file:
    old_data = file.read()
    file.seek(0)
    file.truncate(0)
    file.write('my new content\n')
    file.write(old_data)
Because we are using with open, the file will be closed automatically.
'r' will read a file, 'w' will write text in the file from the start, and 'a' will append. How can I open the file to read and append at the same time?
I tried these, but got errors:
open("filename", "r,a")
open("filename", "w")
open("filename", "r")
open("filename", "a")
error:
invalid mode: 'r,a'
You're looking for the r+/a+/w+ mode, which allows both read and write operations to files.
With r+, the position is initially at the beginning, but reading it once will push it towards the end, allowing you to append. With a+, the position is initially at the end.
with open("filename", "r+") as f:
    # here, position is initially at the beginning
    text = f.read()
    # after reading, the position is pushed toward the end
    f.write("stuff to append")
with open("filename", "a+") as f:
    # here, position is already at the end
    f.write("stuff to append")
If you ever need to do an entire reread, you could return to the starting position by doing f.seek(0).
with open("filename", "r+") as f:
    text = f.read()
    f.write("stuff to append")
    f.seek(0)  # return to the top of the file
    text = f.read()
    assert text.endswith("stuff to append")
(Further Reading: What's the difference between 'r+' and 'a+' when open file in python?)
You can also use w+, but this will truncate (delete) all the existing content.
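To compare the three modes side by side, here is a small sketch, assuming Python 3 and a scratch file created with tempfile (the file name and variable names are arbitrary). Each block reopens the same file and records what it can read back.

```python
import os
import tempfile

# Scratch file for the demonstration; the name is arbitrary.
path = os.path.join(tempfile.mkdtemp(), "modes.txt")
with open(path, "w") as f:
    f.write("abc")

with open(path, "r+") as f:   # keeps the content, starts at the beginning
    r_plus_text = f.read()

with open(path, "a+") as f:   # keeps the content, writes go to the end
    f.write("def")
    f.seek(0)                 # rewind to read everything back
    a_plus_text = f.read()

with open(path, "w+") as f:   # truncates the file as soon as it is opened
    w_plus_text = f.read()
```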
(The original answer showed a diagram from another SO post summarizing these file modes here.)
You can't do that with a textfile. Either you want to read it or you want to write to it. The a or the r specifies a seek to a particular location in the file. Specifying both is asking open to point to two different locations in the file at the same time.
Textfiles in general can't be updated in place. You can use a to add new stuff to the end but that is about it. To do what I think you want, you need to open the existing file in read mode, and open another, new file in write mode, and copy the data from the one to the other.
After that you have two files so you have to take care of deleting the old one. If that is troublesome, take a look at the module in-place.
The other alternative is to read the input file into memory, close and reopen it for writing, then write out a new version of the file. Then you don't have to delete the old copy. But if something goes wrong in the middle you will have no old input file, because you deleted it, and no new output file either, because you didn't successfully write it.
The reason for this is that textfiles are not designed for random access.
I'm trying to add the same text at the beginning of all the txt files that are in a folder.
With this code I can do it, but there is a problem: I don't know why it overwrites part of the text that is at the beginning of each txt file.
import glob
import io
import ntpath
import os

output_dir = "output"
if not os.path.exists(output_dir):
    os.makedirs(output_dir)
for f in glob.glob("*.txt"):
    with open(f, 'r', encoding="utf8") as inputfile:
        with open('%s/%s' % (output_dir, ntpath.basename(f)), 'w', encoding="utf8") as outputfile:
            for line in inputfile:
                outputfile.write(line.replace(line,"more_text"+line+"text_that_is_overwrited"))
            outputfile.seek(0,io.SEEK_SET)
            outputfile.write('text_that_overwrite')
            outputfile.seek(0, io.SEEK_END)
            outputfile.write("more_text")
The content of the txt files that I'm trying to edit starts with this:
here 4 spaces text_line_1
here 4 spaces text_line_2
The result is:
On file1.txt: text_that_overwriteited
On file1.txt: text_that_overwriterited
Your mental model of how writing a file works seems to be at odds with what's actually happening here.
If you seek back to the beginning of the file, you will start overwriting all of the file. There is no such thing as writing into the middle of a file. A file - at the level of abstraction where you have open and write calls - is just a stream; seeking back to the beginning of the stream (or generally, seeking to a specific position in the stream) and writing replaces everything which was at that place in the stream before.
Granted, there is a lower level where you could actually write new bytes into a block on the disk whilst that block still remains the storage for a file which can then be read as a stream. With most modern file systems, the only way to make this work is to replace that block with exactly the same amount of data, which is very rarely feasible. In other words, you can't replace a block containing 1024 bytes with data which isn't also exactly 1024 bytes. This is so marginally useful that it's simply not an operation which is exposed to the higher level of the file system.
With that out of the way, the proper way to "replace lines" is to not write those lines at all. Instead, write the replacement, followed by whichever lines were in the original file.
It's not clear from your question what exactly you want overwritten, so this is just a sketch with some guesses around that part.
import glob
import os

output_dir = "output"
# prefer exist_ok=True over if not os.path.exists()
os.makedirs(output_dir, exist_ok=True)
for f in glob.glob("*.txt"):
    # use a single with statement
    # prefer os.path.basename over ntpath.basename; use os.path.join
    with open(f, 'r', encoding="utf8") as inputfile, \
            open(os.path.join(output_dir, os.path.basename(f)), 'w', encoding="utf8") as outputfile:
        for idx, line in enumerate(inputfile):
            if idx == 0:
                outputfile.write("more text")
                outputfile.write(line.rstrip('\n'))
                outputfile.write("text that is overwritten\n")
                continue
            # else:
            outputfile.write(line)
        outputfile.write("more_text\n")
Given an input file like
here is some text
here is some more text
this will create an output file like
more texthere is some texttext that is overwritten
here is some more text
more_text
where the first line is a modified version of the original first line, and a new line is appended after the original file's contents.
I found this elsewhere on StackOverflow. Why does my text file keep overwriting the data on it?
Essentially, the w mode is meant to overwrite text.
Also, you seem to be writing a sitemap manually. If you are using a web framework like Flask or Django, they have plugin or built-in support for auto-generated sitemaps — you should use that instead. Alternatively, you could create an XML template for the sitemap using Jinja or DTL. Templates are not just for HTML files.
Started Python a week ago and I have some questions to ask about reading and writing to the same files. I've gone through some tutorials online but I am still confused about it. I can understand simple read and write files.
openFile = open("filepath", "r")
readFile = openFile.read()
print readFile
openFile = open("filepath", "a")
appendFile = openFile.write("\nTest 123")
openFile.close()
But, if I try the following I get a bunch of unknown text in the text file I am writing to. Can anyone explain why I am getting such errors and why I cannot use the same openFile object the way shown below.
# I get an error when I use the codes below:
openFile = open("filepath", "r+")
writeFile = openFile.write("Test abc")
readFile = openFile.read()
print readFile
openFile.close()
I will try to clarify my problem. In the example above, openFile is the object used to open the file. I have no problems if I want to write to it the first time. But if I want to use the same openFile to read the file or append something to it, either nothing happens or an error is given. I have to declare another open file object before I can perform another read/write action on the same file.
#I have no problems if I do this:
openFile = open("filepath", "r+")
writeFile = openFile.write("Test abc")
openFile2 = open("filepath", "r+")
readFile = openFile2.read()
print readFile
openFile.close()
I will be grateful if anyone can tell me what I did wrong here, or whether it is just a Python thing. I am using Python 2.7. Thanks!
Updated Response:
This seems like a bug specific to Windows - http://bugs.python.org/issue1521491.
Quoting from the workaround explained at http://mail.python.org/pipermail/python-bugs-list/2005-August/029886.html
the effect of mixing reads with writes on a file open for update is
entirely undefined unless a file-positioning operation occurs between
them (for example, a seek()). I can't guess what
you expect to happen, but seems most likely that what you
intend could be obtained reliably by inserting
fp.seek(fp.tell())
between read() and your write().
My original response demonstrates how reading/writing on the same file opened for appending works. It is apparently not true if you are using Windows.
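A minimal sketch of that workaround, written for Python 3 rather than the 2.7 from the question (so treat it as an illustration, not as a tested fix for the Windows issue). The helper name read_then_append and the scratch path are made up for this example.

```python
import os
import tempfile

def read_then_append(path, extra):
    """Read the whole file, then append `extra` through the same handle."""
    with open(path, "r+") as fp:
        text = fp.read()
        fp.seek(fp.tell())   # file-positioning call between the read and the write
        fp.write(extra)
    return text

# Usage on a throwaway file (the path is hypothetical).
demo_path = os.path.join(tempfile.mkdtemp(), "a.txt")
with open(demo_path, "w") as f:
    f.write("Test abc\n")
before = read_then_append(demo_path, "foo")
with open(demo_path) as f:
    after = f.read()
```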
Original Response:
In 'r+' mode, the write method writes the string to the file at the position where the pointer currently is. In your case, it will write the string "Test abc" at the start of the file. See an example below:
>>> f=open("a","r+")
>>> f.read()
'Test abc\nfasdfafasdfa\nsdfgsd\n'
>>> f.write("foooooooooooooo")
>>> f.close()
>>> f=open("a","r+")
>>> f.read()
'Test abc\nfasdfafasdfa\nsdfgsd\nfoooooooooooooo'
The string "foooooooooooooo" got appended at the end of the file since the pointer was already at the end of the file.
Are you on a system that differentiates between binary and text files? You might want to use 'rb+' as a mode in that case.
Append 'b' to the mode to open the file in binary mode, on systems
that differentiate between binary and text files; on systems that
don’t have this distinction, adding the 'b' has no effect.
http://docs.python.org/2/library/functions.html#open
Every open file has an implicit pointer which indicates where data will be read and written. Normally this defaults to the start of the file, but if you use a mode of a (append) then it defaults to the end of the file. It's also worth noting that the w mode will truncate your file (i.e. delete all the contents) even if you add + to the mode.
Whenever you read or write N characters, the read/write pointer will move forward that amount within the file. I find it helps to think of this like an old cassette tape, if you remember those. So, if you executed the following code:
fd = open("testfile.txt", "w+")
fd.write("This is a test file.\n")
fd.close()
fd = open("testfile.txt", "r+")
print fd.read(4)
fd.write(" IS")
fd.close()
... It should end up printing This and then leaving the file content as This IS a test file.. This is because the initial read(4) returns the first 4 characters of the file, because the pointer is at the start of the file. It leaves the pointer at the space character just after This, so the following write(" IS") overwrites the next three characters with a space (the same as is already there) followed by IS, replacing the existing is.
You can use the seek() method of the file to jump to a specific point. After the example above, if you executed the following:
fd = open("testfile.txt", "r+")
fd.seek(10)
fd.write("TEST")
fd.close()
... Then you'll find that the file now contains This IS a TEST file..
All this applies on Unix systems, and you can test those examples to make sure. However, I've had problems mixing read() and write() on Windows systems. For example, when I execute that first example on my Windows machine then it correctly prints This, but when I check the file afterwards the write() has been completely ignored. However, the second example (using seek()) seems to work fine on Windows.
In summary, if you want to read/write from the middle of a file in Windows I'd suggest always using an explicit seek() instead of relying on the position of the read/write pointer. If you're doing only reads or only writes then it's pretty safe.
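Following that advice, a small sketch that always seeks explicitly before writing, assuming Python 3. The helper name overwrite_at is made up for this example, and it uses binary mode so that the offset is a plain byte count.

```python
import os
import tempfile

def overwrite_at(path, offset, data):
    """Overwrite len(data) bytes starting at byte `offset`."""
    with open(path, "r+b") as fd:
        fd.seek(offset)      # explicit positioning instead of relying on a prior read
        fd.write(data)

# Usage on a throwaway copy of the test file from above.
demo_path = os.path.join(tempfile.mkdtemp(), "testfile.txt")
with open(demo_path, "wb") as fd:
    fd.write(b"This is a test file.\n")
overwrite_at(demo_path, 4, b" IS")
overwrite_at(demo_path, 10, b"TEST")
with open(demo_path, "rb") as fd:
    content = fd.read()
```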
One final point - if you're specifying paths on Windows as literal strings, remember to escape your backslashes:
fd = open("C:\\Users\\johndoe\\Desktop\\testfile.txt", "r+")
Or you can use raw strings by putting an r at the start:
fd = open(r"C:\Users\johndoe\Desktop\testfile.txt", "r+")
Or the most portable option is to use os.path.join():
fd = open(os.path.join("C:\\", "Users", "johndoe", "Desktop", "testfile.txt"), "r+")
You can find more information about file IO in the official Python docs.
Reading and writing happen where the current file pointer is, and the pointer advances with each read/write.
In your particular case, writing to openFile causes the file pointer to point to the end of the file. Trying to read from the end results in EOF.
You need to reset the file pointer to the beginning of the file with seek(0) before reading from it.
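To illustrate, here is a small sketch in Python 3, assuming a scratch file opened in w+ mode (so the write really does leave the pointer at the end of the file). The helper name write_then_read is made up for this example.

```python
import os
import tempfile

def write_then_read(path, text):
    """Write text, then read once without rewinding and once after seek(0)."""
    with open(path, "w+") as f:
        f.write(text)
        at_end = f.read()       # pointer is at the end, so nothing is returned
        f.seek(0)               # reset the pointer to the beginning
        from_start = f.read()   # now the written text comes back
    return at_end, from_start

# Usage on a throwaway file (the path is hypothetical).
demo_path = os.path.join(tempfile.mkdtemp(), "filepath.txt")
at_end, from_start = write_then_read(demo_path, "Test abc")
```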
You can read, modify and save to the same file in Python, but you actually have to replace the whole content of the file, and before writing the updated content you have to call:
# set the pointer to the beginning of the file in order to rewrite the content
edit_file.seek(0)
I needed a function to go through all subdirectories of folder and edit content of the files based on some criteria, if it helps:
# folder_path and condition are assumed to be defined elsewhere
new_file_content = ""
for directories, subdirectories, files in os.walk(folder_path):
    for file_name in files:
        file_path = os.path.join(directories, file_name)
        # open file for reading and writing
        with io.open(file_path, "r+", encoding="utf-8") as edit_file:
            for current_line in edit_file:
                if condition in current_line:
                    # update current line
                    current_line = current_line.replace('john', 'jack')
                new_file_content += current_line
            # set the pointer to the beginning of the file in order to rewrite the content
            edit_file.seek(0)
            # delete actual file content
            edit_file.truncate()
            # rewrite updated file content
            edit_file.write(new_file_content)
            # empties new content in order to set for next iteration
            new_file_content = ""
            # no explicit close() needed: the with block closes the file
file_handle = open("/var/www/transactions.csv", "a")
c = csv.writer(file_handle);
oldamount = amount / 1.98
file_handle.seek(0);
c.writerow( [addre, oldamount, "win"])
Here is my code
I wish to write [addre, oldamount, "win"] to the start of my CSV file; however, it's not working. The row still goes to the bottom.
You are opening the file in append ("a") mode. The documentation for open() points out this behavior explicitly: "all writes append to the end of the file regardless of the current seek position".
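You can see this for yourself with a short experiment. This is a sketch assuming Python 3 and a scratch file; the file name and contents are made up.

```python
import os
import tempfile

# Scratch file standing in for the real CSV.
path = os.path.join(tempfile.mkdtemp(), "log.csv")
with open(path, "w") as fh:
    fh.write("old\n")

with open(path, "a") as fh:
    fh.seek(0)          # has no effect on where the data ends up
    fh.write("new\n")   # still appended after the existing content

with open(path) as fh:
    result = fh.read()
```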
It isn't possible to "just insert" text at the beginning of a file like you want to. You can either read the whole file, add your data in the front, and write it back out, or you live with the fact that the data goes at the end.
Example for rewriting:
with open("/var/www/transactions.csv", "r+") as f:
    olddata = f.read()
    f.seek(0)
    c = csv.writer(f)
    c.writerow([addre, oldamount, "win"])
    f.write(olddata)
Note that this can corrupt your file if something goes wrong while writing. If you want to minimize that possibility, write to a new file, then os.rename() it to overwrite the old one.
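A rough sketch of that safer variant, assuming Python 3. The helper name prepend_row is made up, and I use os.replace() instead of os.rename() because it also overwrites an existing target on Windows. The temporary file is created in the same directory so that the final swap stays on one filesystem.

```python
import csv
import os
import shutil
import tempfile

def prepend_row(csv_path, row):
    """Write `row` plus the old contents to a temp file, then swap it in."""
    directory = os.path.dirname(os.path.abspath(csv_path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", newline="") as tmp, \
                open(csv_path, "r", newline="") as old:
            csv.writer(tmp).writerow(row)   # the new first row
            shutil.copyfileobj(old, tmp)    # then the old data, unchanged
        os.replace(tmp_path, csv_path)      # replace the original in one step
    except BaseException:
        os.remove(tmp_path)                 # do not leave a half-written temp file
        raise

# Usage on a throwaway file (path and values are hypothetical).
csv_path = os.path.join(tempfile.mkdtemp(), "transactions.csv")
with open(csv_path, "wb") as f:
    f.write(b"b,2,lose\r\n")
prepend_row(csv_path, ["a", 1.5, "win"])
```

If anything fails before the swap, the original file is left untouched.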
I'm learning PyGTK and I'm making a Text Editor (That seems to be the hello world of pygtk :])
Anyways, I have a "Save" function that writes the TextBuffer to a file. Looks something like
try:
    f = open(self.working_file_path, "rw+")
    buff = self._get_buffer()
    f.write(self._get_text())
    #update modified flag
    buff.set_modified(False)
    f.close()
except IOError as e:
    print "File Doesnt Exist so bring up Save As..."
    ......
Basically, if the file exist, write the buffer to it, if not bring up the Save As Dialog.
My question is: What is the best way to "update" a file. I seem to only be able to append to the end of a file. I've tried various file modes, but I'm sure I'm missing something.
Thanks in advance!
You can open a file in "r+" mode, which allows you to both read and write to the file, and to seek to particular positions and write there. This probably doesn't help you do what I think you want though; it sounds like you're wanting to only write out the changed data?
Remember that on the disk the file isn't stored as a series of extensible lines, it's just a sequence of bytes; some of those bytes indicate line-endings, but the next line follows on immediately. So if you edit the first line in the file and you write the new first line out, unless the new one happens to be exactly the same length as the old one the second line now won't be in the right place, so you'll need to move it (and have taken a copy of it first if the new line you wrote out was longer than the original). And this now means that the next line isn't in the right position either... and so on until you've had to read in and write out the entire rest of the file.
In practice you almost never write only part of an existing file unless you can simply append more data; if you need to "alter" a file you read it in, alter it in memory, and write it back out or you read in the file in pieces (often line by line) and then write out to a new file as you go (and then possibly move the new file over the top of the original). The first approach is easiest, the second is better for not having to hold the whole thing in memory at once.
At the point where you write to the file, your location is at the end of the file, so you need to seek back to the beginning. Then, you will overwrite the file, but this may leave old content at the end, so you also need to truncate the file.
Additionally, the mode you're specifying ('rw+') is invalid, and I get IOErrors when I try to do some operations on files opened with it. I believe that you want mode 'r+' ("Open for reading and writing. The stream is positioned at the beginning of the file."). 'w+' is similar, but would truncate the file on opening and create it if it didn't exist.
So, what you're looking for might be code like this:
try:
    f = open(self.working_file_path, "r+")
    buff = self._get_buffer()
    f.seek(0)
    f.truncate()
    f.write(self._get_text())
    #update modified flag
    buff.set_modified(False)
    f.close()
except IOError as e:
    print "File Doesnt Exist so bring up Save As..."
    ......
However, you may want to modify this code to correctly catch and handle errors while truncating and writing the file, rather than assuming that all IOErrors in this section are nonexistent-file errors from the call to open.
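One way to separate the two cases, sketched for Python 3 (where a missing file raises FileNotFoundError) rather than the Python 2 of the question. The function name save_text and the scratch paths are made up, and the PyGTK parts are left out. Only the open() call is treated as "file does not exist"; errors during truncating or writing propagate to the caller.

```python
import os
import tempfile

def save_text(path, text):
    """Overwrite an existing file; return False if it does not exist."""
    try:
        f = open(path, "r+")
    except FileNotFoundError:
        return False          # the caller can bring up a "Save As" dialog
    with f:                   # errors below are real I/O errors and propagate
        f.seek(0)
        f.truncate()
        f.write(text)
    return True

# Usage on throwaway paths (both are hypothetical).
demo_dir = tempfile.mkdtemp()
missing_path = os.path.join(demo_dir, "missing.txt")
existing_path = os.path.join(demo_dir, "existing.txt")
with open(existing_path, "w") as fh:
    fh.write("a much longer old text")
missing_result = save_text(missing_path, "x")
saved_result = save_text(existing_path, "short")
```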
Read the file in as a list, add an element to the start of it, write it all out. Something like this.
f = open(self.working_file_path, "r+")
flist = f.readlines()
flist.insert(0, self._get_text())
f.seek(0)
f.writelines(flist)