I am trying to copy the content of one file to another.
The script copies the content successfully, but when I then call read() on the output file to print what was written, the result is blank.
from sys import argv
script, inputFile, outputFile = argv
inFile = open(inputFile)
inData = inFile.read()
outFile = open(outputFile, 'w+')
outFile.write(inData)
print("The new data is:\n",outFile.read())
inFile.close()
outFile.close()
After the write operation the file pointer is at the end of file so you'd need to reset it to the start. Also, the filesystem IO buffers may not have been flushed at that point (you haven't closed the file yet)...
Simple solution: close the outFile and reopen it for reading.
As a side note: always make sure you DO close your files whatever happens, especially when writing, or you may end up with corrupted data. The simplest way is the with statement:
with open(...) as infile, open(...) as outfile:
    outfile.write(infile.read())
# at this point both files have been automagically closed
You forgot to return to the beginning of outFile after writing to it.
So inserting outFile.seek(0) should fix your issues.
After you are done writing, the file pointer is at the end of the file, so there is no data left to read from that position. Reposition the pointer to the start of the file before reading.
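The answers above can be sketched as follows. This is a minimal illustration, not the asker's exact script: the helper name copy_and_read_back and the temporary file names are assumptions made for the demo.

```python
import os
import tempfile


def copy_and_read_back(src_path, dst_path):
    """Copy src_path to dst_path, then return what was written."""
    with open(src_path) as in_file, open(dst_path, "w+") as out_file:
        out_file.write(in_file.read())
        # After write() the pointer sits at the end of the file,
        # so rewind before reading from the same file object.
        out_file.seek(0)
        return out_file.read()


# Demo with throwaway files (hypothetical names).
tmp_dir = tempfile.mkdtemp()
src = os.path.join(tmp_dir, "in.txt")
dst = os.path.join(tmp_dir, "out.txt")
with open(src, "w") as f:
    f.write("hello\nworld\n")

copied = copy_and_read_back(src, dst)
print("The new data is:\n" + copied)
```

Without the seek(0) call, the final read() would return an empty string, which is the blank output described in the question.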
Related
I started Python a week ago and I have some questions about reading and writing to the same file. I've gone through some tutorials online but I am still confused about it. I can understand simple file reads and writes.
openFile = open("filepath", "r")
readFile = openFile.read()
print readFile
openFile = open("filepath", "a")
appendFile = openFile.write("\nTest 123")
openFile.close()
But, if I try the following I get a bunch of unknown text in the text file I am writing to. Can anyone explain why I am getting such errors and why I cannot use the same openFile object the way shown below.
# I get an error when I use the codes below:
openFile = open("filepath", "r+")
writeFile = openFile.write("Test abc")
readFile = openFile.read()
print readFile
openFile.close()
Let me try to clarify my problem. In the example above, openFile is the file object returned by open(). I have no problems when I write to it the first time. But if I then use the same openFile to read from the file or append something to it, either nothing happens or I get an error. I have to create another file object (under the same or a different name) before I can perform another read/write on the same file.
#I have no problems if I do this:
openFile = open("filepath", "r+")
writeFile = openFile.write("Test abc")
openFile2 = open("filepath", "r+")
readFile = openFile2.read()
print readFile
openFile.close()
I would be grateful if anyone could tell me what I did wrong here, or whether it is just a Python thing. I am using Python 2.7. Thanks!
Updated Response:
This seems like a bug specific to Windows - http://bugs.python.org/issue1521491.
Quoting from the workaround explained at http://mail.python.org/pipermail/python-bugs-list/2005-August/029886.html
the effect of mixing reads with writes on a file open for update is
entirely undefined unless a file-positioning operation occurs between
them (for example, a seek()). I can't guess what
you expect to happen, but seems most likely that what you
intend could be obtained reliably by inserting
fp.seek(fp.tell())
between read() and your write().
My original response demonstrates how reading/writing on the same file opened for appending works. It is apparently not true if you are using Windows.
Original Response:
In 'r+' mode, the write method writes the string to the file at the current pointer position. In your case, it will write the string "Test abc" at the start of the file, overwriting whatever is there. See an example below:
>>> f=open("a","r+")
>>> f.read()
'Test abc\nfasdfafasdfa\nsdfgsd\n'
>>> f.write("foooooooooooooo")
>>> f.close()
>>> f=open("a","r+")
>>> f.read()
'Test abc\nfasdfafasdfa\nsdfgsd\nfoooooooooooooo'
The string "foooooooooooooo" got appended at the end of the file since the pointer was already at the end of the file.
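The workaround quoted earlier (a file-positioning call between a read and a write) can be sketched like this. It is only an illustration under assumptions: the temporary file name is made up, and the explicit seek(f.tell()) is there so the result does not depend on platform behavior.

```python
import os
import tempfile

# Hypothetical scratch file for the demo.
path = os.path.join(tempfile.mkdtemp(), "a.txt")
with open(path, "w") as f:
    f.write("Test abc\n")

with open(path, "r+") as f:
    first = f.read()      # reads everything; pointer is now at the end
    f.seek(f.tell())      # positioning operation between read and write
    f.write("foo\n")      # written at the current position (the end)
    f.seek(0)             # rewind before reading the file back
    result = f.read()

print(repr(result))
```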
Are you on a system that differentiates between binary and text files? You might want to use 'rb+' as a mode in that case.
Append 'b' to the mode to open the file in binary mode, on systems
that differentiate between binary and text files; on systems that
don’t have this distinction, adding the 'b' has no effect.
http://docs.python.org/2/library/functions.html#open
Every open file has an implicit pointer which indicates where data will be read and written. Normally this defaults to the start of the file, but if you use a mode of a (append) then it defaults to the end of the file. It's also worth noting that the w mode will truncate your file (i.e. delete all the contents) even if you add + to the mode.
Whenever you read or write N characters, the read/write pointer will move forward that amount within the file. I find it helps to think of this like an old cassette tape, if you remember those. So, if you executed the following code:
fd = open("testfile.txt", "w+")
fd.write("This is a test file.\n")
fd.close()
fd = open("testfile.txt", "r+")
print fd.read(4)
fd.write(" IS")
fd.close()
... It should end up printing This and then leaving the file content as This IS a test file.. This is because the initial read(4) returns the first 4 characters of the file, because the pointer is at the start of the file. It leaves the pointer at the space character just after This, so the following write(" IS") overwrites the next three characters with a space (the same as is already there) followed by IS, replacing the existing is.
You can use the seek() method of the file to jump to a specific point. After the example above, if you executed the following:
fd = open("testfile.txt", "r+")
fd.seek(10)
fd.write("TEST")
fd.close()
... Then you'll find that the file now contains This IS a TEST file..
All this applies on Unix systems, and you can test those examples to make sure. However, I've had problems mixing read() and write() on Windows systems. For example, when I execute that first example on my Windows machine then it correctly prints This, but when I check the file afterwards the write() has been completely ignored. However, the second example (using seek()) seems to work fine on Windows.
In summary, if you want to read/write from the middle of a file in Windows I'd suggest always using an explicit seek() instead of relying on the position of the read/write pointer. If you're doing only reads or only writes then it's pretty safe.
One final point - if you're specifying paths on Windows as literal strings, remember to escape your backslashes:
fd = open("C:\\Users\\johndoe\\Desktop\\testfile.txt", "r+")
Or you can use raw strings by putting an r at the start:
fd = open(r"C:\Users\johndoe\Desktop\testfile.txt", "r+")
Or the most portable option is to use os.path.join():
fd = open(os.path.join("C:\\", "Users", "johndoe", "Desktop", "testfile.txt"), "r+")
You can find more information about file IO in the official Python docs.
Reading and writing happen at the current file pointer, which advances with each read/write.
In your particular case, writing to openFile causes the file pointer to point to the end of the file. Trying to read from the end returns EOF, i.e. nothing.
You need to reset the file pointer to the beginning of the file with seek(0) before reading from it.
You can read, modify and save to the same file in Python, but you actually have to replace the whole content of the file, and you need to call this before writing the updated content:
# set the pointer to the beginning of the file in order to rewrite the content
edit_file.seek(0)
I needed a function to go through all subdirectories of folder and edit content of the files based on some criteria, if it helps:
import io
import os

# folder_path and condition are assumed to be defined elsewhere
new_file_content = ""
for directories, subdirectories, files in os.walk(folder_path):
    for file_name in files:
        file_path = os.path.join(directories, file_name)
        # open file for reading and writing
        with io.open(file_path, "r+", encoding="utf-8") as edit_file:
            for current_line in edit_file:
                if condition in current_line:
                    # update current line
                    current_line = current_line.replace('john', 'jack')
                new_file_content += current_line
            # set the pointer to the beginning of the file in order to rewrite the content
            edit_file.seek(0)
            # delete actual file content
            edit_file.truncate()
            # rewrite updated file content
            edit_file.write(new_file_content)
            # empties new content in order to set for next iteration
            new_file_content = ""
        # no explicit close() needed: the with block closes edit_file
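For a single file, the same read, seek(0), truncate, write sequence might be wrapped in a small function. This is a sketch only; replace_in_file and the demo file name are hypothetical and not part of the answer above.

```python
import io
import os
import tempfile


def replace_in_file(file_path, old, new):
    """Rewrite file_path in place, replacing old with new in every line."""
    with io.open(file_path, "r+", encoding="utf-8") as edit_file:
        updated = [line.replace(old, new) for line in edit_file]
        edit_file.seek(0)       # back to the start before rewriting
        edit_file.truncate()    # drop the old content
        edit_file.writelines(updated)


# Demo on a throwaway file.
demo_path = os.path.join(tempfile.mkdtemp(), "names.txt")
with io.open(demo_path, "w", encoding="utf-8") as f:
    f.write(u"john was here\nnobody else\n")

replace_in_file(demo_path, u"john", u"jack")

with io.open(demo_path, encoding="utf-8") as f:
    rewritten = f.read()
print(rewritten)
```

Collecting the lines in a local list avoids having to reset a shared accumulator between files.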
I have this code
with codecs.open("file.json", mode='a+', encoding='utf-8') as f:
I want:
1) Create the file if it does not exist, and start writing from the start of the file.
2) If it exists, first read it, then truncate it, and then write something.
I found this somewhere
``r'' Open text file for reading. The stream is positioned at the
beginning of the file.
``r+'' Open for reading and writing. The stream is positioned at the
beginning of the file.
``w'' Truncate file to zero length or create text file for writing.
The stream is positioned at the beginning of the file.
``w+'' Open for reading and writing. The file is created if it does not
exist, otherwise it is truncated. The stream is positioned at
the beginning of the file.
``a'' Open for writing. The file is created if it does not exist. The
stream is positioned at the end of the file. Subsequent writes
to the file will always end up at the then current end of file,
irrespective of any intervening fseek(3) or similar.
``a+'' Open for reading and writing. The file is created if it does not
exist. The stream is positioned at the end of the file. Subse-
quent writes to the file will always end up at the then current
end of file, irrespective of any intervening fseek(3) or similar.
a+ mode suits me best, but it only lets me write at the end of the file.
With a+ mode I call f.seek(0) immediately after opening the file, but it has no effect: the write still does not start at the beginning of the file.
Let's say you have a file a with content:
first line
second line
third line
If you need to write from the start of the file, just do:
with open('a','r+') as f:
    f.write("forth line")
Output:
forth line
second line
third line
If you need to remove the current content and write from the start, do:
with open('a','r+') as f:
    f.write("forth line")
    f.truncate()
Output:
forth line
If you need to append after the existing file, do:
with open('a','a') as f:
    f.write("forth line")
Output:
first line
second line
third line
forth line
And, as you suspected, seeking to 0 in a+ mode will not help: you can still read from the start, but every write goes to the end of the file. You might see details from here
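A quick way to see the a+ behavior for yourself is sketched below. It assumes a scratch file created just for the demo; the point is only that seek(0) affects where reads start, while the write still lands at the end.

```python
import os
import tempfile

# Hypothetical scratch file.
path = os.path.join(tempfile.mkdtemp(), "a")
with open(path, "w") as f:
    f.write("first line\n")

with open(path, "a+") as f:
    f.seek(0)
    seen = f.read()             # reading from the start works
    f.seek(0)
    f.write("fourth line\n")    # but the write is appended anyway

with open(path) as f:
    final = f.read()
print(final)
```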
Edit:
Yes, you can dump json with this configuration and still indent. Demo:
dic = {'a':1,"b":2}
import json
with open('a','r+') as f:
    json.dump(dic,f, indent=2)
Output:
{
  "a": 1,
  "b": 2
}third line
Use os.path.isfile():
import os
if os.path.isfile(filename):
    # do stuff
    pass
else:
    # do other stuff
    pass
As to your second question about writing to the beginning of a file: don't use a+. See here for how to prepend to a file. I'll post the relevant bits here:
# credit goes to @eyquem. Not my code
def line_prepender(filename, line):
    with open(filename, 'r+') as f:
        content = f.read()
        f.seek(0, 0)
        f.write(line.rstrip('\r\n') + '\n' + content)
You can check if the file exists, and then branch accordingly like so:
import os.path
file_exists = os.path.isfile(filename)
if file_exists:
    # do something
    pass
else:
    # do something else
    pass
Hope this helps!
You can open the file using os.open to be able to seek and have more control, but you won't be able to use codecs.open or a context manager then, so it's a bit more manual labor:
import os
f = os.fdopen(os.open(filename, os.O_RDWR | os.O_CREAT), 'r+')
try:
    content = f.read()
    f.seek(0)
    f.truncate()
    f.write("Your new data")
finally:
    f.close()
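On Python 3, the object returned by os.fdopen is an ordinary file object, so it can be used in a with block and given an encoding. The sketch below assumes Python 3; the helper name read_then_replace and the demo path are hypothetical.

```python
import os
import tempfile


def read_then_replace(path, new_text):
    """Create path if missing, return its old content, then overwrite it."""
    # O_CREAT creates the file when absent; nothing here truncates on open.
    fd = os.open(path, os.O_RDWR | os.O_CREAT)
    with os.fdopen(fd, "r+", encoding="utf-8") as f:
        old_text = f.read()
        f.seek(0)        # back to the start
        f.truncate()     # discard the old content
        f.write(new_text)
    return old_text


# Demo: first call creates the file, second call sees the first write.
demo_path = os.path.join(tempfile.mkdtemp(), "file.json")
first = read_then_replace(demo_path, '{"a": 1}')
second = read_then_replace(demo_path, '{"b": 2}')
print(repr(first), repr(second))
```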
In Python, there are a few flags you can supply when opening a file. I am a bit baffled trying to find a combination that allows me to do random writes without truncating. The behavior I am looking for is equivalent to this in C: create the file if it doesn't exist, otherwise open it for writing (without truncating)
open(filename, O_WRONLY|O_CREAT)
Python's documentation is confusing (to me): "w" will truncate the file first, "+" is supposed to mean updating, but "w+" will truncate it anyway. Is there any way to achieve this without resorting to the low-level os.open() interface?
Note: the "a" or "a+" doesn't work either (please correct if I am doing something wrong here)
cat test.txt
eee
with open("test.txt", "a+") as f:
    f.seek(0)
    f.write("a")
cat test.txt
eeea
So does append mode insist on writing to the end?
You can do it with os.open:
import os
f = os.fdopen(os.open(filename, os.O_RDWR | os.O_CREAT), 'rb+')
Now you can read, write in the middle of the file, seek, and so on. And it creates the file. Tested on Python 2 and 3.
You should try reading the file first and then reopening it in write mode, as seen here:
with open("file.txt") as reading:
    r = reading.read()
with open("file.txt", "w") as writing:
    writing.write(r)
According to the discussion "Difference between modes a, a+, w, w+, and r+ in built-in open function", a file opened in a mode will always write to the end of the file, irrespective of any intervening fseek(3) or similar.
If you only want to use Python built-in functions, I guess the solution is to first check whether the file exists, and then open it in r+ mode.
For Example:
import os
filepath = "test.txt"
if not os.path.isfile(filepath):
    f = open(filepath, "x") # open for exclusive creation, failing if the file already exists
    f.close()
with open(filepath, "r+") as f: # random read and write
    f.seek(1)
    f.write("a")
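A variation on the same idea, still using only built-ins, is to let "a" mode create the file and then reopen it with "r+". This is a sketch under assumptions: open_no_truncate is a hypothetical helper name, and the demo file is a scratch copy of the test.txt from the question.

```python
import os
import tempfile


def open_no_truncate(path):
    """Open path for random read/write, creating it if needed, never truncating."""
    # "a" creates the file when it is missing and leaves existing content alone.
    open(path, "a").close()
    # "r+" allows seeking and writing anywhere in the file.
    return open(path, "r+")


# Demo: overwrite the first character of an existing file.
demo_dir = tempfile.mkdtemp()
existing = os.path.join(demo_dir, "test.txt")
with open(existing, "w") as f:
    f.write("eee")

with open_no_truncate(existing) as f:
    f.seek(0)
    f.write("a")

with open(existing) as f:
    content = f.read()
print(content)

# Demo: a missing file is simply created empty.
missing = os.path.join(demo_dir, "new.txt")
open_no_truncate(missing).close()
created = os.path.isfile(missing)
```

This avoids the separate existence check, at the cost of opening the file twice.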
You need to use "a" to append, it will create the file if it does not exist or append to it if it does.
You cannot do what you want with append as the pointer automatically moves to the end of the file when you call the write method.
You could check if the file exists then use fileinput.input with inplace=True inserting a line on whichever line number you want.
import fileinput
import os
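def random_write(f, rnd_n, line):
    if not os.path.isfile(f):
        with open(f, "w") as out:
            out.write(line)
    else:
        # with inplace=True, print() output is redirected into the file
        for ind, current in enumerate(fileinput.input(f, inplace=True)):
            if ind == rnd_n:
                # insert the new line just before the existing line rnd_n
                print("{}\n".format(line) + current, end="")
            else:
                print(current, end="")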
http://linux.die.net/man/3/fopen
a+
Open for reading and appending (writing at end of file). The file is created if it does not exist. The initial file position for reading is at the beginning of the file, but output is always appended to the end of the file.
fileinput makes an f.bak copy of the file you pass in, and it is deleted when the output is closed. If you specify a backup extension, e.g. backup=".foo", the backup file will be kept.
with open('pf_d.txt', 'w+') as outputfile:
    rc = subprocess.call([pf, 'disable'], shell=True, stdout=outputfile, stderr=outputfile)
    print outputfile.readlines()
outputfile.readlines() is returning [] even though some data has been written to the file. Something is wrong here.
It looks like subprocess.call() is not blocking and the file is being written after the read. How do I solve this?
The with open('pf_d.txt', 'w+') as outputfile: construct is called a context manager. In this case, the resource is a file represented by the file object outputfile. The context manager makes sure that the file is closed when the context is left. Closing implies flushing, and re-opening the file after that will show you all its contents. So, one option to solve your issue is to read your file after it has been closed:
with open('pf_d.txt', 'w+') as outputfile:
    rc = subprocess.call(...)
with open('pf_d.txt', 'r') as outputfile:
    print outputfile.readlines()
Another option is to re-use the same file object, after flushing and seeking:
with open('pf_d.txt', 'w+') as outputfile:
    rc = subprocess.call(...)
    outputfile.flush()
    outputfile.seek(0)
    print outputfile.readlines()
A file object always has an associated file pointer, indicating the current position in the file. write() advances this pointer to the end of the file. seek(0) moves it back to the beginning, so that a subsequent read() starts from the beginning of the file.
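The flush-and-seek option can be tried end to end with a stand-in command. In this sketch the child process is just the current Python interpreter printing one line; that command and the scratch file location are assumptions that replace the pf call from the question.

```python
import os
import subprocess
import sys
import tempfile

out_path = os.path.join(tempfile.mkdtemp(), "pf_d.txt")

# Stand-in for the real command (hypothetical): print one line and exit.
cmd = [sys.executable, "-c", "print('disabled')"]

with open(out_path, "w+") as outputfile:
    # call() blocks until the child exits; the child writes through the
    # shared file descriptor, leaving the position at the end of the file.
    rc = subprocess.call(cmd, stdout=outputfile, stderr=subprocess.STDOUT)
    outputfile.flush()
    outputfile.seek(0)      # rewind before reading
    lines = outputfile.readlines()

print(rc, lines)
```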
I am basically trying to read a number from a file, convert it to an int, add one to it, then write the new number back to the file. However, every time I run this code and then open the .txt file, it is blank. Any help would be appreciated, thanks! I am a Python newbie.
f=open('commentcount.txt','r')
counts = f.readline()
f.close
counts1 = int(counts)
counts1 = counts1 + 1
print(counts1)
f2 = open('commentcount.txt','w') <---(the file overwriting seems to happen here?)
f2.write(str(counts1))
Having empty files
This issue is caused by you failing to close the file descriptor. You have f.close but it should be f.close() (a function call). And you also need an f2.close() in the end.
Without the close it takes a while until the contents of the buffer arrive in the file. And it is a good practice to close file descriptors as soon as they are not used.
As a side note, you can use the following syntactic sugar to ensure that the file descriptor is closed as soon as possible:
with open(file, mode) as f:
    do_something_with(f)
Now, regarding the overwriting part:
Writing to file without overwriting the previous content.
Short answer: You don't open the file in the proper mode. Use the append mode ("a").
Long answer:
It is the intended behavior. Read the following:
>>> help(open)
Help on built-in function open in module __builtin__:
open(...)
open(name[, mode[, buffering]]) -> file object
Open a file using the file() type, returns a file object. This is the
preferred way to open a file. See file.__doc__ for further information.
>>> print file.__doc__
file(name[, mode[, buffering]]) -> file object
Open a file. The mode can be 'r', 'w' or 'a' for reading (default),
writing or appending. The file will be created if it doesn't exist
when opened for writing or appending; it will be truncated when
opened for writing. Add a 'b' to the mode for binary files.
Add a '+' to the mode to allow simultaneous reading and writing.
If the buffering argument is given, 0 means unbuffered, 1 means line
buffered, and larger numbers specify the buffer size. The preferred way
to open a file is with the builtin open() function.
Add a 'U' to mode to open the file for input with universal newline
support. Any line ending in the input file will be seen as a '\n'
in Python. Also, a file so opened gains the attribute 'newlines';
the value for this attribute is one of None (no newline read yet),
'\r', '\n', '\r\n' or a tuple containing all the newline types seen.
So, reading the manuals shows that if you want the content to be kept you should open in append mode:
open(file, "a")
You should use the with statement. This ensures that the file is closed no matter what:
with open('file', 'r') as fd:
    value = int(fd.read())
with open('file', 'w') as fd:
    fd.write(str(value + 1))  # write() needs a string, not an int
You never close the file. If you don't properly close the file, the OS might not commit any changes. To avoid this problem, it is recommended that you use Python's with statement to open files, as it will close them for you once you are done with the file.
with open('my_file.txt', 'a') as f:
    do_stuff()
Python open file parameters:
w:
Opens a file for writing only. Overwrites the file if the file exists.
If the file does not exist, creates a new file for writing.
You can use a (append):
Opens a file for appending. The file pointer is at the end of the file
if the file exists. That is, the file is in the append mode. If the
file does not exist, it creates a new file for writing.
for more information you can read here
One more piece of advice is to use with:
with open("x.txt","a") as f:
    f.write(data)  # "a" mode is write-only, so f.read() would fail here
............
For example:
with open(r'c:\commentcount.txt','r') as fp:
    counts = fp.readline()
counts = str(int(counts) + 1)
with open(r'c:\commentcount.txt','w') as fp:
    fp.write(counts)
Note that this will work only if a file named commentcount.txt already exists and has an int on its first line, since r mode does not create a new file. Also, there will be only one counter: each run overwrites the number, it does not append a new one.
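Putting the pieces together, the counter update might look like the sketch below. The function name increment_counter and the choice to start from 0 when the file is missing or empty are assumptions for illustration, not something the answers above prescribe.

```python
import os
import tempfile


def increment_counter(path):
    """Read an integer from the first line of path, add one, write it back."""
    try:
        with open(path, "r") as fp:
            # Assumption: an empty first line counts as 0.
            count = int(fp.readline().strip() or 0)
    except FileNotFoundError:
        # Assumption: a missing file means the counter starts at 0.
        count = 0
    count += 1
    # "w" truncates, which is what we want: the file holds one number.
    with open(path, "w") as fp:
        fp.write(str(count))
    return count


# Demo on a scratch file.
counter_path = os.path.join(tempfile.mkdtemp(), "commentcount.txt")
first = increment_counter(counter_path)
second = increment_counter(counter_path)
print(first, second)
```

Because both with blocks close the file before the function returns, the new value is on disk by the time the next call reads it.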