Delete a row from a text file with Python

I have a file where each line starts with a number. The user can delete a row by typing in the number of the row the user would like to delete.
The issue I'm having is setting the mode for opening it. When I use a+, the original content is still there. However, tacked onto the end of the file are the lines that I want to keep. On the other hand, when I use w+, the entire file is deleted. I'm sure there is a better way than opening it with w+ mode, deleting everything, and then re-opening it and appending the lines.
def DeleteToDo(self):
    print "Which Item Do You Want To Delete?"
    DeleteItem = raw_input(">") #select a line number to delete
    print "Are You Sure You Want To Delete Number" + DeleteItem + "(y/n)"
    VerifyDelete = str.lower(raw_input(">"))
    if VerifyDelete == "y":
        FILE = open(ToDo.filename,"a+") #open the file (tried w+ as well, entire file is deleted)
        FileLines = FILE.readlines() #read and display the lines
        for line in FileLines:
            FILE.truncate()
            if line[0:1] != DeleteItem: #if the number (first character) of the current line doesn't equal the number to be deleted, re-write that line
                FILE.write(line)
    else:
        print "Nothing Deleted"
This is what a typical file may look like
1. info here
2. more stuff here
3. even more stuff here

When you open a file for writing, you clobber the file (delete its current contents and start a new file). You can find this out by reading the documentation for the open() function.
When you open a file for appending, you do not clobber the file. But how can you delete just one line? A file is a sequence of bytes stored on a storage device; there is no way for you to delete one line and have all the other lines automatically "slide down" into new positions on the storage device.
(If your data was stored in a database, you could actually delete just one "row" from the database; but a file is not a database.)
So, the traditional way to solve this: you read from the original file, and you copy it to a new output file. As you copy, you perform any desired edits; for example, you can delete a line simply by not copying that one line; or you can insert a line by writing it in the new file.
Then, once you have successfully written the new file, and successfully closed it, if there is no error, you go ahead and rename the new file back to the same name as the old file (which clobbers the old file).
In Python, your code should be something like this:
import os

# "num_to_delete" was specified by the user earlier.
# I'm assuming that the number to delete is set off from
# the rest of the line with a space.
s_to_delete = str(num_to_delete) + ' '

def want_input_line(line):
    return not line.startswith(s_to_delete)

in_fname = "original_input_filename.txt"
out_fname = "temporary_filename.txt"

with open(in_fname) as in_f, open(out_fname, "w") as out_f:
    for line in in_f:
        if want_input_line(line):
            out_f.write(line)

os.rename(out_fname, in_fname)
Note that if you happen to have a file called temporary_filename.txt it will be clobbered by this code. Really we don't care what the filename is, and we can ask Python to make up some unique filename for us, using the tempfile module.
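As a rough sketch of the tempfile idea, assuming Python 3.3 or later and a file whose lines start with "N." as in the question: the helper name delete_numbered_line is made up for illustration. It creates the temporary file next to the original so the final rename stays on one filesystem, and it uses os.replace, which overwrites an existing destination.

```python
import os
import tempfile


def delete_numbered_line(path, number):
    """Rewrite `path` without the line that starts with "<number>."."""
    prefix = "%d." % number
    directory = os.path.dirname(os.path.abspath(path))
    # mkstemp picks a unique name, so no existing file gets clobbered.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w") as out_f, open(path) as in_f:
            for line in in_f:
                if not line.startswith(prefix):
                    out_f.write(line)
        # Swap the new file into place only after it was written and closed.
        os.replace(tmp_path, path)
    except Exception:
        # On any failure, remove the temporary file and keep the original.
        os.remove(tmp_path)
        raise
```

Calling delete_numbered_line("todo.txt", 2) on the sample file from the question would leave lines 1 and 3 in place; the remaining lines are not renumbered.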
Any recent version of Python will let you use multiple context managers in a single with statement, but if you happen to be using Python 2.6 or something you can nest two with statements to get the same effect:
with open(in_fname) as in_f:
    with open(out_fname, "w") as out_f:
        for line in in_f:
            ... # do the rest of the code
Also, note that I did not use the .readlines() method to get the input lines, because .readlines() reads the entire contents of the file into memory, all at once, and if the file is very large this will be slow or might not even work. You can simply write a for loop using the "file object" you get back from open(); this will give you one line at a time, and your program will work with even really large files.
EDIT: Note that my answer is assuming that you just want to do one editing step. As @jdi noted in comments for another answer, if you want to allow for "interactive" editing where the user can delete multiple lines, or insert lines, or whatever, then the easiest way is in fact to read all the lines into memory using .readlines(), insert/delete/update/whatever on the resulting list, and then only write out the list to a file a single time when editing is all done.

def DeleteToDo():
    print ("Which Item Do You Want To Delete?")
    DeleteItem = raw_input(">") #select a line number to delete
    print ("Are You Sure You Want To Delete Number" + DeleteItem + "(y/n)")
    DeleteItem=int(DeleteItem)
    VerifyDelete = str.lower(raw_input(">"))
    if VerifyDelete == "y":
        FILE = open('data.txt',"r") #open the file for reading
        lines=[x.strip() for x in FILE if int(x[:x.index('.')])!=DeleteItem] #read all the lines first except the line which matches the line number to be deleted
        FILE.close()
        FILE = open('data.txt',"w") #open the file again, this time for writing (this empties it)
        for x in lines:
            FILE.write(x+'\n') #write the data to the file
        FILE.close()
    else:
        print ("Nothing Deleted")

DeleteToDo()

Instead of writing out all lines one by one to the file, delete the line from memory (to which you read the file using readlines()) and then write the memory back to disk in one shot. That way you will get the result you want, and you won't have to clog the I/O.
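A minimal sketch of that in-memory approach, assuming the file is small enough to hold in a list. The function name delete_line_in_memory and the zero-based index parameter are illustrative choices, not something from the question.

```python
def delete_line_in_memory(path, index):
    """Remove the line at zero-based `index` and rewrite the file once.

    Returns True if a line was removed, False if the index was out of range.
    """
    with open(path) as f:
        lines = f.readlines()          # whole file in memory
    if not 0 <= index < len(lines):
        return False                   # nothing to delete; file untouched
    del lines[index]
    with open(path, "w") as f:         # "w" truncates, then we write it all back
        f.writelines(lines)
    return True
```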

You could mmap the file... after having read the suitable documentation...

You don't need to check for the line numbers in your file; you can do something like this:
def DeleteToDo(self):
    print "Which Item Do You Want To Delete?"
    DeleteItem = int(raw_input(">")) - 1 # zero-based index of the line to delete
    print "Are You Sure You Want To Delete Number" + str(DeleteItem + 1) + "(y/n)"
    VerifyDelete = str.lower(raw_input(">"))
    if VerifyDelete == "y":
        with open(ToDo.filename,"r") as f:
            lines = ''.join([a for i,a in enumerate(f) if i != DeleteItem])
        with open(ToDo.filename, "w") as f:
            f.write(lines)
    else:
        print "Nothing Deleted"

Related

How can I delete and rewrite a line without unknown characters?

I have a database.txt file: the first column holds usernames, the second passwords, and the rest five recovery questions and answers, alternating. I want to let a user change their password without affecting another user's username, since the two values may be the same. I have found a way to delete the previous line and append the new line of modified details to the file. However, there is always a string of unknown characters at the start of the appended line, and other characters are being changed, not just the second value in the list. Please help me find a way to avoid this.
https://repl.it/repls/NecessaryBoldButtons You can find the code here; changing it will affect everyone, so please copy it elsewhere.
https://onlinegdb.com/BJbsn9-cL
I just need the password to be changed on user input, not other strings. The reason for all this code is that when changing a person's password, another username could be changed.
[Screenshot: the original file]
[Screenshot: the file afterwards] On the line where data[0] = "bye", only the second string in the list should be changed to newpass, not all of the others.
'''
import linecache
f = open("database.txt" , "r+")
for loop in range(3):
    line = f.readline()
    data = line.split(",")
    if data[1] == "bye":
        print(data[1]) #These are to help me understand what is happening
        print(data[0])
        b = data[0]
        newpass = "Hi"
        a = data[1]
        fn = 'database.txt'
        e = open(fn)
        output = []
        str="happy"
        for line in e:
            if not line.startswith(str):
                output.append(line)
        e.close()
        print(output)
        e = open(fn, 'w')
        e.writelines(output)
        e.close()
        line1 = linecache.getline("database.txt" ,loop+1)
        print(line)
        password = True
        print("Password Valid\n")
        write = (line1.replace(a, newpass))
        write = f.write(line1.replace(a, newpass))
f.close()
'''
This is the file in text:
username,password,Recovery1,Answer1,Recovery2,Answer2,Recovery3,Answer3,Recovery4,Answer4,
Recovery5,Answer5,o,o,o,o,o,o,o,o,o,o,
happy,bye,o,o,o,o,o,o,o,o,o,o,
bye,happy,o,o,o,o,o,o,o,o,o,o,
Support is very much appreciated
Feel free to change the code as much as you need to, as it is already a mess
Thanks in Advance

This should be pretty easy. The basic idea is:
    open input file for reading
    open output file for writing
    for each line in input file
        if user name == "happy"
            change password in line
        write line to output file
It should be pretty easy to convert that to Python.
From comments, and by examining your code, I get the feeling that you're trying to update a line in-place. That is, it looks like your expectation is that given the file "database.txt" that contains this:
username,password,Recovery1,Answer1,Recovery2,Answer2,Recovery3,Answer3, Recovery4,Answer4,Recovery5,Answer5,
o,o,o,o,o,o,o,o,o,o,
happy,bye,o,o,o,o,o,o,o,o,o,o,
bye,happy,o,o,o,o,o,o,o,o,o,o,
When you make the change, your new "database.txt" will contain this:
username,password,Recovery1,Answer1,Recovery2,Answer2,Recovery3,Answer3, Recovery4,Answer4,Recovery5,Answer5,
o,o,o,o,o,o,o,o,o,o,
happy,Hi,o,o,o,o,o,o,o,o,o,o,
bye,happy,o,o,o,o,o,o,o,o,o,o,
You can do that, but you can't do it in-place. You have to write all the lines of the file, including the changed line, to a new temporary file. Then you can delete the old "database.txt" and rename the temporary file.
You can't update a line in a text file, because if you change the length of the line then you'll either end up with extra space at the end of the line you changed (because the new line has fewer characters than the old line), or you'll overwrite the beginning of the next line (the new line is longer than the old line).
The only other option is to load all of the lines into memory and close the file. Then change the line or lines you want to change, in memory. Finally, open the "database.txt" file for writing and output all of the lines from memory to the file.
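The in-memory option could look roughly like the sketch below. It assumes the comma-separated layout shown above (username in the first field, password in the second); the function name change_password is hypothetical. Only the second field of the matching line is replaced, so a username that happens to equal someone else's password is left alone.

```python
def change_password(path, username, new_password):
    """Replace the password (2nd field) on lines whose 1st field is `username`.

    Returns True if at least one line was changed.
    """
    with open(path) as f:
        lines = f.readlines()          # load everything, then close the file

    changed = False
    for i, line in enumerate(lines):
        fields = line.rstrip("\n").split(",")
        if len(fields) > 1 and fields[0] == username:
            fields[1] = new_password   # touch only the password field
            lines[i] = ",".join(fields) + "\n"
            changed = True

    if changed:
        with open(path, "w") as f:     # rewrite the whole file in one go
            f.writelines(lines)
    return changed
```

With the sample data, change_password("database.txt", "happy", "Hi") would turn "happy,bye,..." into "happy,Hi,..." and leave the "bye,happy,..." line as it is.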

Python: read a line and write back to that same line

I am using python to make a template updater for html. I read a line and compare it with the template file to see if there are any changes that needs to be updated. Then I want to write any changes (if there are any) back to the same line I just read from.
After a readline(), my file pointer is positioned on the next line. Is there any way I can write back to the same line without having to open two file handles for reading and writing?
Here is a code snippet of what I want to do:
cLine = fp.readline()
if cLine != templateLine:
    # Here is where I would like to write back to the line I read from
    # in cLine

Updating lines in place in text file - very difficult
Many questions on SO are about reading a file and updating it at the same time.
While this is technically possible, it is very difficult.
(Text) files are not organized on disk by lines, but by bytes.
The problem is that the number of bytes in the old line is very often different from the number in the new one, and this messes up the resulting file.
Update by creating a new file
While it sounds inefficient, it is the most effective way from a programming point of view.
Just read from the file on one side, write to another file on the other side, close the files, and copy the content of the newly created file over the old one.
Or build the new content in memory and write it over the old file after you have closed it.
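To make the byte-offset problem concrete, here is a small demonstration on a throwaway temporary file. The helper names overwrite_at_start and demo_overwrite are invented for this sketch. It writes a longer first line over a shorter one and shows that the extra bytes land on top of the second line instead of pushing it down.

```python
import os
import tempfile


def overwrite_at_start(path, data):
    """Write `data` at byte offset 0 without shifting anything after it."""
    with open(path, "r+b") as f:
        f.write(data)


def demo_overwrite():
    """Return the file contents after overwriting "one\\n" with "ONE!!\\n"."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(b"one\ntwo\nthree\n")
        # The new first line is 2 bytes longer than the old one.
        overwrite_at_start(path, b"ONE!!\n")
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.remove(path)


print(demo_overwrite())  # b'ONE!!\no\nthree\n' : "two" has been damaged
```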

At the OS level, things are a bit different from how they look from Python. From Python, a file looks almost like a list of strings, with each string having arbitrary length, so it seems easy to swap a line for something else without affecting the rest of the lines:
l = ["Hello", "world"]
l[0] = "Good bye"
In reality, though, any file is just a stream of bytes, with strings following each other without any "padding". So you can only overwrite the data in-place if the resulting string has exactly the same length as the source string - otherwise it'll simply overwrite the following lines.
If that is the case (your processing guarantees not to change the length of strings), you can "rewind" the file to the start of the line and overwrite the line with new data. The below script converts all lines in file to uppercase in-place:
def eof(f):
    # Return True when the current position is at (or past) the end of the file.
    cur_loc = f.tell()
    f.seek(0, 2)
    eof_loc = f.tell()
    f.seek(cur_loc, 0)
    return cur_loc >= eof_loc

with open('testfile.txt', 'r+t') as fp:
    while True:
        last_pos = fp.tell()
        line = fp.readline()
        new_line = line.upper()
        fp.seek(last_pos) # rewind to the start of the line just read
        fp.write(new_line)
        print("Read %s, Wrote %s" % (line, new_line))
        if eof(fp):
            break
Somewhat related: Undo a Python file readline() operation so file pointer is back in original state
This approach is only justified when your output lines are guaranteed to have the same length, and when, say, the file you're working with is really huge so you have to modify it in place.
In all other cases it would be much easier and more performant to just build the output in memory and write it back at once. Another option is to write to a temporary file, then delete the original and rename the temporary file so it replaces the original file.

Replacing a line on Python

I'm trying to convert PHP code to Python, and I have problems with replacing lines. Although I find it easier to do using Python, I'm absolutely lost; I can find the line to replace, I can add something to the end of the line, but I can't write the line again on the file.
file = open("cache.ucb", 'rb')
for line in file:
    if line.split('~!')[0] == ex[4]:
        line += "~!" + mask[0]
        line = line.rstrip() + "\n"
        # Write on the file here!
Basically, the file uses ~! as a separator, and I read each line. If the first token separated with ~! of the line starts with ex[4], which could be for example Catbuntu, I want to append mask[0], which could be Bousie, on the end of that line. Then I remove the new line characters and add one to the end.
And there's the problem. I want to write the file as it was, but changing only that line. Is that possible?

Assuming you're on python >=2.7, the following should work a treat
original = open(filename)
newfile = []
for line in original:
    if line.split('~!')[0] == ex[4]:
        # strip the old newline first, so the new token lands on the same line
        line = line.rstrip() + "~!" + mask[0] + "\n"
    newfile.append(line)
original.close()
amended = open(filename, "w")
amended.writelines(newfile)
amended.close()
If for whatever reason you are on python 2.6 or lower, replace the second to last line with:
amended.write("".join(newfile))
EDIT: Fixed to replace a mistake copied from the question, factor out a filename.

You cannot modify a file in-place, at least not if you want to insert characters to a line. You'll just end up overwriting the start of the next line.
There are two different ways to do this:
1. Read the file into memory, close it, then write back the new version.
2. Write a new temporary file as you go along, then move it over the original version.
So, how do you choose between them? I'll try to summarize the differences, ordered so that each one typically trumps the ones below if it's important (but that's just "typically"—you have to think through your own use case):
2 doesn't require holding the entire thing in memory. If your file is, say, 20GB long, this is obviously a huge win; if it's 16KB, it doesn't matter.
2 makes the entire operation atomic. Even if it fails halfway through, or some other process tries to read the file while you're in the middle of changing it, there is no way anyone can see some invalid half-modified file; they will see either the original file, or the new one.
2 requires some free disk space (because there are, temporarily, two copies of the file at the same time).
2 is a huge pain in the neck if you care about both Windows and POSIX.
2 can involve copying across filesystems if the original file and the temp directory are on different filesystems, unless you're careful about it.
2 is simpler if neither of the above two are an issue.
Drakekin's answer tells you how to do #1.
Here's how to do #2 if you don't care about Windows or about cross-filesystem issues:
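import os
import tempfile

infile = open("cache.ucb", 'rb')
outfile = tempfile.NamedTemporaryFile(delete=False)
for line in infile:
    if line.split('~!')[0] == ex[4]:
        # strip the old newline first, so the new token lands on the same line
        line = line.rstrip() + "~!" + mask[0] + "\n"
    outfile.write(line)
infile.close()
outfile.close() # flush and close the temporary file before renaming it
os.rename(outfile.name, "cache.ucb")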
You can solve the cross-filesystem problem by, e.g., passing dir=os.path.dirname(original path) to the NamedTemporaryFile constructor, but only if you're sure you'll always have permissions to create a new file alongside the original (which isn't always guaranteed, just because you have permission to rewrite the original—UNIX permissions, Windows ACLs, the OS X sandbox, etc. all give ways that can be false).
To solve the Windows problem… well, start with Is an atomic file rename (with overwrite) possible on Windows, and similar discussions all over the internet.

Open the file in mode 'wb' and put file.write(line) at the end of your loop.

You don't have your file open for writing.
file = open("cache.ucb", 'rb')
This line opens a file for reading in binary mode. You need to open it for writing also.
Try opening the file in write mode, 'w' and writing the line back.
Or you can simply open your file for read/write at the beginning and write inside your loop:
file = open("cache.ucb", 'a+')

How do I modify the last line of a file?

The last line of my file is:
29-dez,40,
How can I modify that line so that it reads:
29-Dez,40,90,100,50
Note: I don't want to write a new line. I want to take the same line and put new values after 29-Dez,40,
I'm new at python. I'm having a lot of trouble manipulating files and for me every example I look at seems difficult.

Unless the file is huge, you'll probably find it easier to read the entire file into a data structure (which might just be a list of lines), and then modify the data structure in memory, and finally write it back to the file.
On the other hand maybe your file is really huge - multiple GBs at least. In which case: the last line is probably terminated with a new line character, if you seek to that position you can overwrite it with the new text at the end of the last line.
So perhaps:
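import os

newline = os.linesep.encode()
f = open("foo.file", "r+b") # "r+b" updates in place; "wb" would empty the file first
f.seek(-len(newline), os.SEEK_END) # step back over the final line terminator
f.write(b"new text at end of last line" + newline)
f.close()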
(Modulo line endings on different platforms)
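A slightly more defensive sketch of the same seek-to-the-end idea, which does not assume a particular line terminator: it steps back over any trailing "\n" or "\r" bytes, cuts the file there, and writes the extra text plus a newline. The function name append_to_last_line and the UTF-8 encoding are assumptions for illustration.

```python
import os


def append_to_last_line(path, extra):
    """Append `extra` to the end of the last line without reading the whole file."""
    with open(path, "r+b") as f:
        f.seek(0, os.SEEK_END)
        end = f.tell()
        # Walk backwards over trailing newline bytes, one byte at a time.
        while end > 0:
            f.seek(end - 1)
            if f.read(1) not in (b"\n", b"\r"):
                break
            end -= 1
        f.seek(end)
        f.truncate()                      # drop the old terminator(s)
        f.write(extra.encode("utf-8") + b"\n")
```

For the file in the question, append_to_last_line("file.txt", "90,100,50") would turn the last line "29-dez,40," into "29-dez,40,90,100,50". Only the tail of the file is touched, however large the file is.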

To expand on what Doug said, in order to read the file contents into a data structure you can use the readlines() method of the file object.
The below code sample reads the file into a list of "lines", edits the last line, then writes it back out to the file:
#!/usr/bin/python
MYFILE = "file.txt"
# read the file into a list of lines
with open(MYFILE, 'r') as f:
    lines = f.readlines()
# now edit the last line of the list of lines
# (the line already ends with a comma, so no leading comma is needed)
lines[-1] = lines[-1].rstrip() + "90,100,50\n"
# now write the modified list back out to the file
with open(MYFILE, 'w') as f:
    f.writelines(lines)
If the file is very large then this approach will not work well, because this reads all the file lines into memory each time and writes them back out to the file, which is very inefficient. For a small file however this will work fine.

Don't work with files directly, make a data structure that fits your needs in form of a class and make read from/write to file methods.

I recently wrote a script to do something very similar to this. It would traverse a project, find all module dependencies and add any missing import statements. I won't clutter this post up with the entire script, but I'll show how I went about modifying my files.
import os
from mmap import mmap

def insert_import(filename, text):
    if len(text) < 1:
        return
    f = open(filename, 'r+')
    m = mmap(f.fileno(), os.path.getsize(filename))
    origSize = m.size()
    m.resize(origSize + len(text))
    pos = 0
    while True:
        l = m.readline()
        if l.startswith(('import', 'from')):
            continue
        else:
            pos = m.tell() - len(l)
            break
    m[pos+len(text):] = m[pos:origSize]
    m[pos:pos+len(text)] = text
    m.close()
    f.close()
Summary: This snippet takes a filename and a blob of text to insert. It finds the last import statement already present, and sticks the text in at that location.
The part I suggest paying most attention to is the use of mmap. It lets you work with files in the same manner you may work with a string. Very handy.

Python truncate lines as they are read

I have an application that reads lines from a file and runs its magic on each line as it is read. Once the line is read and properly processed, I would like to delete the line from the file. A backup of the removed line is already being kept. I would like to do something like
file = open('myfile.txt', 'rw+')
for line in file:
    processLine(line)
    file.truncate(line)
This seems like a simple problem, but I would like to do it right rather than a whole lot of complicated seek() and tell() calls.
Maybe all I really want to do is remove a particular line from a file.
After spending far too long on this problem, I decided that everyone was probably right and this is just not a good way to do things. It just seemed like such an elegant solution. What I was looking for was something akin to a FIFO that would just let me pop lines out of a file.

Remove all lines after you're done with them:
with open('myfile.txt', 'r+') as file:
    for line in file:
        processLine(line)
    file.truncate(0)
Remove each line independently:
lines = open('myfile.txt').readlines()
for line in lines[::-1]: # process lines in reverse order
    processLine(line)
    del lines[-1] # remove the [last] line
open('myfile.txt', 'w').writelines(lines)
You can leave only those lines that cause exceptions:
import fileinput, sys
for line in fileinput.input(['myfile.txt'], inplace=1):
    try:
        processLine(line)
    except Exception:
        sys.stdout.write(line) # it prints to 'myfile.txt'
In general, as other people have already said, what you are trying to do is a bad idea.

You can't. It is just not possible with actual text file implementations on current filesystems.
Text files are sequential, because the lines in a text file can be of any length.
Deleting a particular line would mean rewriting the entire file from that point on.
Suppose you have a file with the following 4 lines:
'line1\nline2reallybig\nline3\nlast line'
To delete the second line you'd have to move the third and fourth lines' positions in the disk. The only way would be to store the third and fourth lines somewhere, truncate the file on the second line, and rewrite the missing lines.
If you know the size of every line in the text file, you can truncate the file in any position using .truncate(line_size * line_number) but even then you'd have to rewrite everything after the line.
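The store, truncate, and rewrite steps described above can be sketched as follows. The function name delete_line_in_place and the 1-based line_number parameter are illustrative. Note that everything after the deleted line is still read into memory and written again, which is exactly the cost this answer is pointing out.

```python
def delete_line_in_place(path, line_number):
    """Delete the 1-based `line_number` by shifting the rest of the file up."""
    with open(path, "r+b") as f:
        for _ in range(line_number - 1):
            f.readline()               # skip the lines we keep
        start = f.tell()               # byte offset where the doomed line begins
        f.readline()                   # skip over the line to delete
        tail = f.read()                # store everything that follows it
        f.seek(start)
        f.write(tail)                  # rewrite the tail over the deleted line
        f.truncate()                   # cut off the leftover bytes at the end
```

With the four-line example above, delete_line_in_place(path, 2) would leave 'line1\nline3\nlast line'.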

You're better off keeping an index into the file so that you can start where you stopped last, without destroying part of the file. Something like this would work:
try:
    for index, line in enumerate(file):
        processLine(line)
except:
    # Failed, start from this line number next time.
    print(index)
    raise

Truncating the file as you read it seems a bit extreme. What if your script has a bug that doesn't cause an error? In that case you'll want to restart at the beginning of your file.
How about having your script print the line number it breaks on and having it take a line number as a parameter so you can tell it which line to start processing from?
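One way this restart-from-a-line-number idea might look, as a sketch: the function process_from and its parameters are hypothetical, and process_line stands in for whatever the script does with each line. It skips the first `start` lines, processes the rest, and reports the zero-based index of the first line that fails so that the next run can resume there.

```python
import itertools


def process_from(path, start, process_line):
    """Process lines of `path` beginning at zero-based line `start`.

    Returns the index of the first line that raised, or None if all succeeded.
    The file itself is never modified.
    """
    with open(path) as f:
        remaining = itertools.islice(f, start, None)   # skip already-done lines
        for index, line in enumerate(remaining, start):
            try:
                process_line(line)
            except Exception:
                return index           # resume from here next time
    return None
```

A wrapper script could read `start` from a command-line argument and print the returned index when it is not None.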

First of all, calling the operation truncate is probably not the best pick. If I understand the problem correctly, you want to delete everything up to the current position in file. (I would expect truncate to cut everything from the current position up to the end of the file. This is how the standard Python truncate method works, at least if I Googled correctly.)
Second, I am not sure it is wise to modify the file while iterating over it using the for loop. Wouldn’t it be better to save the number of lines processed and delete them after the main loop has finished, exception or not? The file iterator supports in-place filtering, which means it should be fairly simple to drop the processed lines afterwards.
P.S. I don’t know Python, take this with a grain of salt.

A related post has what seems a good strategy to do that, see
How can I run the first process from a list of processes stored in a file and immediately delete the first line as if the file was a queue and I called "pop"?
I have used it as follows:
import os
tasklist_file = open(tasklist_filename, 'r') # 'rw' is not a valid mode; reading is all we need
first_line = tasklist_file.readline()
tasklist_file.close()
temp = os.system("sed -i -e '1d' " + tasklist_filename) # remove first line from task file
I'm not sure it works on Windows.
Tried it on a mac and it did do the trick.

This is what I use for file-based queues. It returns the first line and rewrites the file with the rest. When the file is empty it returns None:
def pop_a_text_line(filename):
    with open(filename,'r') as f:
        S = f.readlines()
    if len(S) > 0:
        pop = S[0]
        with open(filename,'w') as f:
            f.writelines(S[1:])
    else:
        pop = None
    return pop
