python write method append - python

I'm stuck in a very basic problem of I/O in python. I'd like to insert some line in existing file (called ofe, output file), extracted from an source file (called ife, input file) according to arguments passed by user as stored in an list called lineRange (which has an index idx and values lineNumber).
This is the result:
for ifeidx,ifeline in enumerate(ife,1): #for each line of the input file...
with open(outFile,'r+') as ofe:
for idx,lineNumber in enumerate(lineRange,1): #... check if it's present in desired list of lines...
if (ifeidx == lineNumber): #...if found...
ofeidx = 0
for ofeidx, ofeline in enumerate(ofe,1):
if (ofeidx == idx): #...just scroll the the output file and find which is the exact position in desired list...
ofe.write(ifeline) #...put the desired line in correct order. !!! This is always appending at the end of out file!!!!
break
Problem is, the write() method is always pointing to the end of file, appending the lines instead of inserting them when scrolling the output file.
I really don't understand what's happening since the file is open in read+write (r+) mode, neither append (a) nor read+append (r+a) mode, .
I'm also aware that code will (should) overwrite the output file lines. Additional information are the OS WIndow7, Python version 2.7 and development tool is Eclipse with PyDev 3.7.1.xx
Any suggestion on what I'm doing wrong?

You can start by reading the whole file with readlines(), which will return a list. After that you just need to do list.insert(index, value) and write it again back to the file.
with open(outFile, "r") as f:
data = f.readlines()
data.insert(index, value)
with open(outFile, "w+") as f:
f.write(data)
Of course you should change this approach if you are dealing with a huge file.
By the way, if you are not using the with statement you should close the file in the end.

Related

Replace string in specific line of nonstandard text file

Similar to posting: Replace string in a specific line using python, however results were not forethcomming in my slightly different instance.
I working with python 3 on windows 7. I am attempting to batch edit some files in a directory. They are basically text files with .LIC tag. I'm not sure if that is relevant to my issue here. I am able to read the file into python without issue.
My aim is to replace a specific string on a specific line in this file.
import os
import re
groupname = 'Oldtext'
aliasname = 'Newtext'
with open('filename') as f:
data = f.readlines()
data[1] = re.sub(groupname,aliasname, data[1])
f.writelines(data[1])
print(data[1])
print('done')
When running the above code I get an UnsupportedOperation: not writable. I am having some issue writing the changes back to the file. Based on suggestion of other posts, I edited added the w option to the open('filename', "w") function. This causes all text in the file to be deleted.
Based on suggestion, the r+ option was tried. This leads to successful editing of the file, however, instead of editing the correct line, the edited line is appended to the end of the file, leaving the original intact.
Writing a changed line into the middle of a text file is not going to work unless it's exactly the same length as the original - which is the case in your example, but you've got some obvious placeholder text there so I have no idea if the same is true of your actual application code. Here's an approach that doesn't make any such assumption:
with open('filename', 'r') as f:
data = f.readlines()
data[1] = re.sub(groupname,aliasname, data[1])
with open('filename', 'w') as f:
f.writelines(data)
EDIT: If you really wanted to write only the single line back into the file, you'd need to use f.tell() BEFORE reading the line, to remember its position within the file, and then f.seek() to go back to that position before writing.

Delete a line in multiple text files with the same line beginning but varying line ending using Python v3.5

I have a folder full of .GPS files, e.g. 1.GPS, 2.GPS, etc...
Within each file is the following five lines:
Trace #1 at position 0.004610
$GNGSA,A,3,02,06,12,19,24,25,,,,,,,2.2,1.0,2.0*21
$GNGSA,A,3,75,86,87,,,,,,,,,,2.2,1.0,2.0*2C
$GNVTG,39.0304,T,39.0304,M,0.029,N,0.054,K,D*32
$GNGGA,233701.00,3731.1972590,S,14544.3073733,E,4,09,1.0,514.675,M,,,0.49,3023*27
...followed by the same data structure, with different values, over the next five lines:
Trace #6 at position 0.249839
$GNGSA,A,3,02,06,12,19,24,25,,,,,,,2.2,1.0,2.0*21
$GNGSA,A,3,75,86,87,,,,,,,,,,2.2,1.0,2.0*2C
$GNVTG,247.2375,T,247.2375,M,0.081,N,0.149,K,D*3D
$GNGGA,233706.00,3731.1971997,S,14544.3075178,E,4,09,1.0,514.689,M,,,0.71,3023*2F
(I realise the values after the $GNGSA lines don't vary in the above example. This is just a bad example... in the real dataset they do vary!)
I need to remove the lines that begin with "$GNGSA" and "$GNVTG" (i.e. I need to delete lines 2, 3, and 4 from each group of five lines within each .GPS file).
This five-line pattern continues for a varying number of times throughout each file (for some files, there might only be two five-line groups, while other files might have hundreds of the five-line groups). Hence, deleting these lines based on the line number will not work (because the line number would be variable).
The problem I am having (as seen in the above examples) is that the text that follows the "$GNGSA" or "$GNVTG" varies.
I'm currently learning Python (I'm using v3.5), so figured this would make for a good project for me to learn a few new tricks...
What I've tried already:
So far, I've managed to create the code to loop through the entire folder:
import os
indir = '/Users/dhunter/GRID01/' # input directory
for i in os.listdir(indir): # for each "i" (iteration) within the indir variable directory...
if i.endswith('.GPS'): # if the filename of an iteration ends with .GPS, then...
print(i + ' loaded') # print the filename to CLI, simply for debugging purposes.
with open(indir + i, 'r') as my_file: # open the iteration file
file_lines = my_file.readlines() # uses the readlines method to create a list of all lines in the file.
print(file_lines) # this prints the entire contents of each file to CLI for debugging purposes.
Everything in the above works perfectly.
What I need help with:
How do I detect and delete the lines themselves, and then save the file (to the same location; there is no need to save to a different filename)?
The filenames - which usually end with ".GPS" - sometimes end with ".gps" instead (the only difference being the case). My above code will only work with the uppercase files. Besides completely duplicating the code and changing the endswith argument, how do I make it work with both cases?
In the end, my file needs to look something like this:
Trace #1 at position 0.004610
$GNGGA,233701.00,3731.1972590,S,14544.3073733,E,4,09,1.0,514.675,M,,,0.49,3023*27
Trace #6 at position 0.249839
$GNGGA,233706.00,3731.1971997,S,14544.3075178,E,4,09,1.0,514.689,M,,,0.71,3023*2F
Any suggestions, please? Thanks in advance. :)
You're almost there.
import os
indir = '/Users/dhunter/GRID01/' # input directory
for i in os.listdir(indir): # for each "i" (iteration) within the indir variable directory...
if i.endswith('.GPS'): # if the filename of an iteration ends with .GPS, then...
print(i + ' loaded') # print the filename to CLI, simply for debugging purposes.
with open(indir + i, 'r') as my_file: # open the iteration file
for line in my_file:
if not line.startswith('$GNGSA') and not line.startswith('$GNVTG'):
print(line)
As per what the others have said, you're on the right track! Where you're going wrong is in the case-sensitive file extension check, and in reading in the entire file contents at once (this isn't per se wrong, but it's probably adding complexity we won't need).
I've commented your code, removing all the debug stuff for simplicity, to illustrate what I mean:
import os
indir = '/path/to/files'
for i in os.listdir(indir):
if i.endswith('.GPS'): #This CASE SENSITIVELY checks the file extension
with open(indir + i, 'r') as my_file: # Opens the file
file_lines = my_file.readlines() # This reads the ENTIRE file at once into an array of lines
So we need to fix the case sensitivity issue, and instead of reading in all the lines, we'll instead read the file line-by-line, check each line to see if we want to discard it or not, and write the lines we're interested in into an output file.
So, incorporating #tdelaney's case-insensitive fix for file name, we replace line #5 with
if i.lower().endswith('.gps'): # Case-insensitively check the file name
and instead of reading in the entire file at once, we'll instead iterate over the file stream and print each desired line out
with open(indir + i) as in_file, open(indir + i + 'new.gps') as out_file: # Open the input file for reading and creates + opens a new output file for writing - thanks #tdelaney once again!
for line in in_file # This reads each line one-by-one from the in file
if not line.startswith('$GNGSA') and not line.startswith('$GNVTG'): # Check the line has what we want (thanks Avinash)
out_file.write(line + "\n") # Write the line to the new output file
Note that you should make certain that you open the output file OUTSIDE of the 'for line in in_file' loop, or else the file will be overwritten on every iteration which will erase what you've already written to it so far (I suspect this is the issue you've had with the previous answers). Open both files at the same time and you can't go wrong.
Alternatively, you can specify the file access mode when you open the file, as per
with open(indir + i + 'new.gps', 'a'):
which will open the file in append-mode, which is a specialised from of write-mode that preserves the original contents of the file, and appends new data to it instead of overwriting existing data.
Ok, based on suggestions by Avinash Raj, tdelaney, and Sampson Oliver, here on Stack Overflow, and another friend who helped privately, here is the solution that is now working:
import os
indir = '/Users/dhunter/GRID01/' # input directory
for i in os.listdir(indir): # for each "i" (iteration) within the indir variable directory...
if i.lower().endswith('.gps'): # if the filename of an iteration ends with .GPS, then...
if not i.lower().endswith('.gpsnew.gps'): # if the filename does not end with .gpsnew.gps, then...
print(i + ' loaded') # print the filename to CLI.
with open (indir + i, 'r') as my_file:
for line in my_file:
if not line.startswith('$GNGSA'):
if not line.startswith('$GNVTG'):
with open(indir + i + 'new.gps', 'a') as outputfile:
outputfile.write(line)
outputfile.write('\r\n')
(You'll see I had to add in another layer of if statement to stop it from using the output files from previous uses of the script "if not i.lower().endswith('.gpsnew.gps'):", but this line can easily be deleted for anyone who uses these instructions in future)
We switched the open mode on the third-last line to "a" for append, so that it would save all the right lines to the file, rather than overwriting each time.
We also added in the final line to add a line break at the end of each line.
Thanks everyone for their help, explanations, and suggestions. Hopefully this solution will be useful to someone in future. :)
2. The filenames:
The if accepts any expression returning a truth value, and you can combine expressions with the standart boolean operators: if i.endswith('.GPS') or i.endswith('.gps').
You can also put the ... and ... expression after the if in brackets, to feel more sure, but it's not neccessary.
Alternatively, as a less universal solution, (but since you wanted to learn a few tricks :)) you can use string manipulation in this case: an object of type string has a lot of methods. '.gps'.upper() gives '.GPS' -- try, if you can make use of this! (even a printed string is a string object, but your variables behave the same).
1. Finding the Lines:
As you can see in the other solution, you need not read out all of your lines, you can check if want to have them 'on the fly'. But I will stick to your approach with readlines. It gives you a list, and lists support indexing and slicing. Try:
anylist[stratindex, endindex, stride], for any values, so for example try: newlist = range(100)[1::5].
It's always helpfull to try out the easy basic operations in interactive mode, or at the beginning of your script. Here range(100) is just some sample list. Here you see, how the python for-syntax works, differently than in other languages: you can iterate over any list, and if you just need integers, you create a list with integers with range().
So this will work the same with any other list -- e.g. the one you get from readlines()
This selects a slice from the list, beginnig with the second element, ending at the end (since the end index is omitted), and taking every 5th element. Now you have this sub-list, you can just revome it from the original. So for the example with the range:
a = range(100)
del(a[1::5])
print a
So you see, that the appropriate items have been removed. Now do the same with your file_lines, and then proceed to remove the other lines you want to remove.
Then, in a new with block, open the file for writing and do writelines(file_lines), so the remainig lines are written back to the file.
Of course you can also take the approach to look for the content of each line with a for loop over your list and startswith(). Or you can combine the approaches, and check, if deleting lines by number leaves the right starts, so you can print an error if something is unexpected...
3. Saving the file
You can close your file after you have the lines saved in the readlines(). In fact this is done automatically at the end of the with-block. Then just open it in 'w' mode instead of 'r' and do yourfilename.writelines(yourlist). You don't need to save, it's saven on closing.

Python: read a line and write back to that same line

I am using python to make a template updater for html. I read a line and compare it with the template file to see if there are any changes that needs to be updated. Then I want to write any changes (if there are any) back to the same line I just read from.
Reading the file, my file pointer is positioned now on the next line after a readline(). Is there anyway I can write back to the same line without having to open two file handles for reading and writing?
Here is a code snippet of what I want to do:
cLine = fp.readline()
if cLine != templateLine:
# Here is where I would like to write back to the line I read from
# in cLine
Updating lines in place in text file - very difficult
Many questions in SO are trying to read the file and update it at once.
While this is technically possible, it is very difficult.
(text) files are not organized on disk by lines, but by bytes.
The problem is, that read number of bytes on old lines is very often different from new one, and this mess up the resulting file.
Update by creating a new file
While it sounds inefficient, it is the most effective way from programming point of view.
Just read from file on one side, write to another file on the other side, close the files and copy the content from newly created over the old one.
Or create the file in memory and finally do the writing over the old one after you close the old one.
At the OS level the things are a bit different from how it looks from Python - from Python a file looks almost like a list of strings, with each string having arbitrary length, so it seems to be easy to swap a line for something else without affecting the rest of the lines:
l = ["Hello", "world"]
l[0] = "Good bye"
In reality, though, any file is just a stream of bytes, with strings following each other without any "padding". So you can only overwrite the data in-place if the resulting string has exactly the same length as the source string - otherwise it'll simply overwrite the following lines.
If that is the case (your processing guarantees not to change the length of strings), you can "rewind" the file to the start of the line and overwrite the line with new data. The below script converts all lines in file to uppercase in-place:
def eof(f):
cur_loc = f.tell()
f.seek(0,2)
eof_loc = f.tell()
f.seek(cur_loc, 0)
if cur_loc >= eof_loc:
return True
return False
with open('testfile.txt', 'r+t') as fp:
while True:
last_pos = fp.tell()
line = fp.readline()
new_line = line.upper()
fp.seek(last_pos)
fp.write(new_line)
print "Read %s, Wrote %s" % (line, new_line)
if eof(fp):
break
Somewhat related: Undo a Python file readline() operation so file pointer is back in original state
This approach is only justified when your output lines are guaranteed to have the same length, and when, say, the file you're working with is really huge so you have to modify it in place.
In all other cases it would be much easier and more performant to just build the output in memory and write it back at once. Another option is to write to a temporary file, then delete the original and rename the temporary file so it replaces the original file.

How to erase the file contents of text file in Python?

I have text file which I want to erase in Python. How do I do that?
In python:
open('file.txt', 'w').close()
Or alternatively, if you have already an opened file:
f = open('file.txt', 'r+')
f.truncate(0) # need '0' when using r+
Opening a file in "write" mode clears it, you don't specifically have to write to it:
open("filename", "w").close()
(you should close it as the timing of when the file gets closed automatically may be implementation specific)
Not a complete answer more of an extension to ondra's answer
When using truncate() ( my preferred method ) make sure your cursor is at the required position.
When a new file is opened for reading - open('FILE_NAME','r') it's cursor is at 0 by default.
But if you have parsed the file within your code, make sure to point at the beginning of the file again i.e truncate(0)
By default truncate() truncates the contents of a file starting from the current cusror position.
A simple example
As #jamylak suggested, a good alternative that includes the benefits of context managers is:
with open('filename.txt', 'w'):
pass
When using with open("myfile.txt", "r+") as my_file:, I get strange zeros in myfile.txt, especially since I am reading the file first. For it to work, I had to first change the pointer of my_file to the beginning of the file with my_file.seek(0). Then I could do my_file.truncate() to clear the file.
Writing and Reading file content
def writeTempFile(text = None):
filePath = "/temp/file1.txt"
if not text: # If not provided return file content
f = open(filePath, "r")
slug = f.read()
return slug
else:
f = open(filePath, "a") # Create a blank file
f.seek(0) # sets point at the beginning of the file
f.truncate() # Clear previous content
f.write(text) # Write file
f.close() # Close file
return text
It Worked for me
If security is important to you then opening the file for writing and closing it again will not be enough. At least some of the information will still be on the storage device and could be found, for example, by using a disc recovery utility.
Suppose, for example, the file you're erasing contains production passwords and needs to be deleted immediately after the present operation is complete.
Zero-filling the file once you've finished using it helps ensure the sensitive information is destroyed.
On a recent project we used the following code, which works well for small text files. It overwrites the existing contents with lines of zeros.
import os
def destroy_password_file(password_filename):
with open(password_filename) as password_file:
text = password_file.read()
lentext = len(text)
zero_fill_line_length = 40
zero_fill = ['0' * zero_fill_line_length
for _
in range(lentext // zero_fill_line_length + 1)]
zero_fill = os.linesep.join(zero_fill)
with open(password_filename, 'w') as password_file:
password_file.write(zero_fill)
Note that zero-filling will not guarantee your security. If you're really concerned, you'd be best to zero-fill and use a specialist utility like File Shredder or CCleaner to wipe clean the 'empty' space on your drive.
You have to overwrite the file. In C++:
#include <fstream>
std::ofstream("test.txt", std::ios::out).close();
You can also use this (based on a few of the above answers):
file = open('filename.txt', 'w')
file.close()
of course this is a really bad way to clear a file because it requires so many lines of code, but I just wrote this to show you that it can be done in this method too.
happy coding!
You cannot "erase" from a file in-place unless you need to erase the end. Either be content with an overwrite of an "empty" value, or read the parts of the file you care about and write it to another file.
Assigning the file pointer to null inside your program will just get rid of that reference to the file. The file's still there. I think the remove() function in the c stdio.h is what you're looking for there. Not sure about Python.
Since text files are sequential, you can't directly erase data on them. Your options are:
The most common way is to create a new file. Read from the original file and write everything on the new file, except the part you want to erase. When all the file has been written, delete the old file and rename the new file so it has the original name.
You can also truncate and rewrite the entire file from the point you want to change onwards. Seek to point you want to change, and read the rest of file to memory. Seek back to the same point, truncate the file, and write back the contents without the part you want to erase.
Another simple option is to overwrite the data with another data of same length. For that, seek to the exact position and write the new data. The limitation is that it must have exact same length.
Look at the seek/truncate function/method to implement any of the ideas above. Both Python and C have those functions.
This is my method:
open the file using r+ mode
read current data from the file using file.read()
move the pointer to the first line using file.seek(0)
remove old data from the file using file.truncate(0)
write new content and then content that we saved using file.read()
So full code will look like this:
with open(file_name, 'r+') as file:
old_data = file.read()
file.seek(0)
file.truncate(0)
file.write('my new content\n')
file.write(old_data)
Because we are using with open, file will automatically close.

How do I modify the last line of a file?

The last line of my file is:
29-dez,40,
How can I modify that line so that it reads:
29-Dez,40,90,100,50
Note: I don't want to write a new line. I want to take the same line and put new values after 29-Dez,40,
I'm new at python. I'm having a lot of trouble manipulating files and for me every example I look at seems difficult.
Unless the file is huge, you'll probably find it easier to read the entire file into a data structure (which might just be a list of lines), and then modify the data structure in memory, and finally write it back to the file.
On the other hand maybe your file is really huge - multiple GBs at least. In which case: the last line is probably terminated with a new line character, if you seek to that position you can overwrite it with the new text at the end of the last line.
So perhaps:
f = open("foo.file", "wb")
f.seek(-len(os.linesep), os.SEEK_END)
f.write("new text at end of last line" + os.linesep)
f.close()
(Modulo line endings on different platforms)
To expand on what Doug said, in order to read the file contents into a data structure you can use the readlines() method of the file object.
The below code sample reads the file into a list of "lines", edits the last line, then writes it back out to the file:
#!/usr/bin/python
MYFILE="file.txt"
# read the file into a list of lines
lines = open(MYFILE, 'r').readlines()
# now edit the last line of the list of lines
new_last_line = (lines[-1].rstrip() + ",90,100,50")
lines[-1] = new_last_line
# now write the modified list back out to the file
open(MYFILE, 'w').writelines(lines)
If the file is very large then this approach will not work well, because this reads all the file lines into memory each time and writes them back out to the file, which is very inefficient. For a small file however this will work fine.
Don't work with files directly, make a data structure that fits your needs in form of a class and make read from/write to file methods.
I recently wrote a script to do something very similar to this. It would traverse a project, find all module dependencies and add any missing import statements. I won't clutter this post up with the entire script, but I'll show how I went about modifying my files.
import os
from mmap import mmap
def insert_import(filename, text):
if len(text) < 1:
return
f = open(filename, 'r+')
m = mmap(f.fileno(), os.path.getsize(filename))
origSize = m.size()
m.resize(origSize + len(text))
pos = 0
while True:
l = m.readline()
if l.startswith(('import', 'from')):
continue
else:
pos = m.tell() - len(l)
break
m[pos+len(text):] = m[pos:origSize]
m[pos:pos+len(text)] = text
m.close()
f.close()
Summary: This snippet takes a filename and a blob of text to insert. It finds the last import statement already present, and sticks the text in at that location.
The part I suggest paying most attention to is the use of mmap. It lets you work with files in the same manner you may work with a string. Very handy.

Categories