python file's content disappear after reading? [duplicate] - python

I have a problem with iterating on a file. Here's what I type on the interpreter and the result:
>>> f = open('baby1990.html', 'rU')
>>> for line in f.readlines():
...     print(line)
...
# ... all the lines from the file appear here ...
When I try to iterate over the same open file again, I get nothing!
>>> for line in f.readlines():
...     print(line)
...
>>>
There is no output at all. To solve this, I have to close() the file and then open it again for reading. Is that normal behavior?

Yes, that is normal behavior. The first pass reads to the end of the file (picture a tape that has played through), so there is nothing left to read until you reset the position, either by calling f.seek(0) to return to the start of the file or by closing the file and opening it again.
If you prefer, you can use the with statement, which closes the file for you automatically.
For example:
# the 'rU' mode was removed in Python 3.11; universal newlines are the default in text mode
with open('baby1990.html') as f:
    for line in f:
        print(line)
Once this block finishes executing, the file is closed, so you can run the block repeatedly without closing the file yourself, and each run reads the file from the beginning.
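A minimal sketch of the rewind approach, assuming a small throwaway file created only for the demonstration (the file name and its contents are made up):

```python
import os
import tempfile

# Create a small throwaway file so the example is self-contained.
path = os.path.join(tempfile.mkdtemp(), 'demo.txt')  # hypothetical file
with open(path, 'w') as f:
    f.write('one\ntwo\nthree\n')

with open(path) as f:
    first_pass = f.readlines()   # reads to the end of the file
    second_pass = f.readlines()  # position is at EOF, so this is empty
    f.seek(0)                    # rewind to the start
    third_pass = f.readlines()   # the lines are available again

print(first_pass)
print(second_pass)
print(third_pass)
```

The second call returns an empty list rather than raising an error, which is why the loop in the question silently prints nothing.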

As the file object reads the file, it uses a pointer to keep track of where it is. If you read part of the file, then go back to it later it will pick up where you left off. If you read the whole file, and go back to the same file object, it will be like reading an empty file because the pointer is at the end of the file and there is nothing left to read. You can use file.tell() to see where in the file the pointer is and file.seek to set the pointer. For example:
>>> file = open('myfile.txt')
>>> file.tell()
0
>>> file.readline()
'one\n'
>>> file.tell()
4L
>>> file.readline()
'2\n'
>>> file.tell()
6L
>>> file.seek(4)
>>> file.readline()
'2\n'
Also, you should know that file.readlines() reads the whole file and stores it as a list. That's useful to know because you can replace:
for line in file.readlines():
    # do stuff
file.seek(0)
for line in file.readlines():
    # do more stuff
with:
lines = file.readlines()
for each_line in lines:
    # do stuff
for each_line in lines:
    # do more stuff
You can also iterate over a file, one line at a time, without holding the whole file in memory (this can be very useful for very large files) by doing:
for line in file:
    # do stuff

The file object is a buffer. When you read from the buffer, that portion that you read is consumed (the read position is shifted forward). When you read through the entire file, the read position is at the end of the file (EOF), so it returns nothing because there is nothing left to read.
If you have to reset the read position on a file object for some reason, you can do:
f.seek(0)

Of course.
That is normal and sane behaviour.
Instead of closing and re-opening, you could rewind the file.

Related

i write on a file and read it but it seems when i run again

I want to write a file that says "hello guys how are you", but each word must be an item of a list. Here is my code. It shows nothing when I run it; when I run it a second time, it shows the words item by item as I want. But when I open the text file, the words are written twice.
with open('stavanger.txt','r+') as f:  # the with statement closes the file automatically
    words = ['hello\n','guys\n','how\n', 'are\n','you\n']
    f.writelines(words)
    for i in f:
        x = i.rstrip().split(',')  # turn each line of the text file into a list, splitting items on commas
        print(x)
The problem is that writes to a file go through a buffer. After the line f.writelines(words), nothing has actually been written to the file yet; only the buffer has changed.
In effect, the file still hasn't changed and the file pointer is still at the beginning of the file. So the second time you run your code, you see the content from the first run printed, which leaves the file pointer at the end of the file; only then is the buffer passed to the file, which gives you the duplicated content.
Simply use mode='w' if you just want to write to a file.
You start reading the file from where the writing stopped. It is better to open the file first for writing, then again for reading.
Something like this:
with open('stavanger.txt', 'w') as f:  # the with statement closes the file automatically
    words = ['hello\n', 'guys\n', 'how\n', 'are\n', 'you\n']
    f.writelines(words)
with open('stavanger.txt', 'r') as f:
    for i in f:
        x = i.rstrip().split(',')  # turn each line of the text file into a list, splitting items on commas
        print(x)
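If you would rather keep a single open file, a possible alternative is to open it in 'w+' mode and rewind before reading. This is only a sketch: the temporary path is made up for the demonstration, and 'w+' truncates any existing content, so it only suits the case where the file is meant to be rewritten from scratch.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'stavanger_demo.txt')  # hypothetical path
words = ['hello\n', 'guys\n', 'how\n', 'are\n', 'you\n']

# 'w+' truncates (or creates) the file and allows reading as well as writing.
with open(path, 'w+') as f:
    f.writelines(words)
    f.seek(0)  # move the position back to the start before reading
    items = [line.rstrip('\n') for line in f]

print(items)
```

Without the f.seek(0) call, the loop would start reading at the position where the writing stopped and find nothing.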

Checking if a text file has another line Python

I'm working on a script to parse text files into a spreadsheet for myself, and in doing so I need to read through them. The issue is finding out when to stop. Java readers have a method called hasNext() or hasNextLine(), and I was wondering whether there is something like that in Python. For some reason I can't find this anywhere.
Ex:
with open(f) as file:
    file.readline()
    nextLine = True
    while nextLine:
        file.readline()
        # Do stuff
        if not file.hasNextLine():
            nextLine = False
Just use a for loop to iterate over the file object:
for line in file:
    # do stuff...
Note that each line string includes the newline character (\n) at the end. You can strip it with rstrip('\n'), which is safer than slicing with [:-1] because slicing would cut a real character from a last line that has no trailing newline:
for line in file:
    line = line.rstrip('\n')
    # do stuff...
or:
for line in (l.rstrip('\n') for l in file):
    # do stuff...
You can only check whether the file has another line by reading it (although you can tell whether you are at the end of the file, without reading, by comparing file.tell() with the end-of-file position).
This can be done by calling file.readline and checking that the returned string is not empty, or with timgeb's method of calling next and catching the StopIteration exception.
So to answer your question exactly, you can check whether a file has another line through:
next_line = file.readline()
if next_line:
    # has next line, do whatever...
or, without modifying the current file pointer:
def has_another_line(file):
    cur_pos = file.tell()
    does_it = bool(file.readline())
    file.seek(cur_pos)
    return does_it
This restores the file pointer, leaving the file object in its original state.
e.g.
$ printf "hello\nthere\nwhat\nis\nup\n" > f.txt
$ python -q
>>> f = open('f.txt')
>>> def has_another_line(file):
...     cur_pos = file.tell()
...     does_it = bool(file.readline())
...     file.seek(cur_pos)
...     return does_it
...
>>> has_another_line(f)
True
>>> f.readline()
'hello\n'
The typical pattern that I use for reading text files is this:
with open('myfile.txt', 'r') as myfile:
    lines = myfile.readlines()
    for line in lines:
        if 'this' in line:  # your criterion for skipping lines goes here
            continue
        # Do something here
Using with keeps the file open only until all of the code within its block has executed; then the file is closed. I also think it's valuable to highlight the readlines() method here, which reads all lines in the file and stores them in a list. In terms of handling newline (\n) characters, I would point you to @Joe Iddon's answer.
Python doesn't have an end-of-file (EOF) indicator, but you could get the same effect this way:
with open(f) as file:
    file.seek(0, 2)    # go to end of file
    eof = file.tell()  # get end-of-file position
    file.seek(0, 0)    # go back to start of file
    file.readline()
    nextLine = True    # maybe nextLine = (file.tell() != eof)
    while nextLine:
        file.readline()
        # Do stuff
        if file.tell() == eof:
            nextLine = False
But as others have pointed out, you may do better by treating the file as an iterable, like this:
with open(f) as file:
    # next(file, '') returns '' at end of file instead of raising StopIteration
    next_line = next(file, '')
    # the loop terminates when next_line is '',
    # i.e., after failing to read another line at end of file
    while next_line:
        # Do stuff
        next_line = next(file, '')
Files are iterators over lines. If all you want to do is check whether a file has a line left, you can issue line = next(file) and catch the StopIteration raised in case there isn't another line. Alternatively, you can use line = next(file, default) with a non-string default value (e.g. None) and then check against that.
Note that in most cases, you know that you are done when the for loop over the file ends, as the other answers have explained. So make sure you actually need that kind of fine-grained control with next.
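For the cases where you really do need to know whether another line follows, the next-with-a-default idea can be wrapped in a small generator. This is only a sketch: with_has_next is a hypothetical helper name, not something from the standard library, and io.StringIO stands in for an open text file.

```python
import io

def with_has_next(lines):
    """Yield (line, has_next) pairs; has_next is False only for the last line."""
    iterator = iter(lines)
    sentinel = object()  # unique marker meaning "nothing left"
    current = next(iterator, sentinel)
    while current is not sentinel:
        upcoming = next(iterator, sentinel)  # look one line ahead
        yield current, upcoming is not sentinel
        current = upcoming

# io.StringIO stands in for an open text file here.
fake_file = io.StringIO('hello\nthere\nup\n')
result = [(line.rstrip('\n'), has_next) for line, has_next in with_has_next(fake_file)]
print(result)
```

The helper reads one line ahead, so it never needs tell() or seek() and works on any iterable of lines, not only on seekable files.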
with open(filepath, 'rt+') as f:
    for line in f.readlines():
        # code to process each line
Opening the file this way also closes it when the block finishes, which releases the file handle promptly. Note that readlines() still loads the whole file into memory, which might not matter depending on the file size.
The first line is comparable to:
f = open(....)
f.readlines() gives you a list of all lines in the file.
The loop will start at the first line and end at the last line, and shouldn't throw any errors regarding EOF, for example.
[Edit]
Notice the 'rt+' in the open call. This opens the file in text mode for both reading and writing; if you only need to read, 'r' (or 'rt') is enough. Text mode means the contents are decoded to str for you, so no manual decoding is required.

parse a file, appending each line at the end and removing the line from the top

I am trying to move each line down to the bottom of the file; this is what the file looks like:
daodaos 12391039
idiejda 94093420
jfijdsf 10903213
....
#completed
So at the end of the parsing, I plan to have all the entries that were at the top placed under the line that says #completed.
The problem is that I am not sure how to do this in one pass. I know that I can read the whole file, every single line, close the file, and then re-open it in write mode, searching for each line, removing it from the file, and adding it to the end, but that feels incredibly inefficient.
Is there a way, in one pass, to process the current line and then, in the same for loop, delete the line and append it to the end of the file?
file = open('myfile.txt', 'a')
for items in file:
    # process items line
    # append items line to the end of the file
    # remove items line from the file
I suggest keeping it simple: read everything in, then write it back.
with open('myfile.txt') as f:
    lines = f.readlines()
newlines = lines  # fall back to the original order if the marker is missing
for idx, line in enumerate(lines):
    # do your stuff, check if completed, rearrange the list
    if line.startswith('#completed'):
        newlines = lines[idx:] + lines[:idx]
        break
with open('myfile.txt', 'w') as f:
    f.write(''.join(newlines))  # write back the new lines
Below is another version I could think of, if you insist on modifying the file while reading it:
with open('myfile.txt', 'r+') as f:
    newlines = ''
    line = True
    while line:
        line = f.readline()
        if line.startswith('#completed'):
            # line += f.read()  # uncomment this line if you are interested in the lines after #completed
            f.truncate()
            f.seek(0)
            f.write(line + newlines)
            break
        else:
            newlines += line
Not really.
Your main problem here is that you're iterating on the file at the same time you want to change it. This will Do Bad Things (tm) to your processing, unless you plan to micro-manage the file position pointer.
You do have that power: the seek method lets you move to a given file location, expressed in bytes. seek(0) moves to the start of the file; seek(0, 2) moves to the end. The problem you face is that your for loop trusts that this pointer indicates the next line to read.
One distinct problem is that you can't just remove a line from the middle of the file; something exists in those bytes. Think of it as lines of text on a page, written in pencil. You can erase line 4, but this does not cause lines 5-end to magically float up half a centimeter; they're still in the same physical location.
How to Do It ... sort of
Read all of the lines into a list. You can easily change a list the way you want. When you hit the end, then write the list back to the file -- or use your magic seek and append powers to alter only a little of it.
I recommend doing this the simple way: read the whole file into a variable, move the completed lines into another variable, and then rewrite your file.

Can read() and readline() be used together?

Is it possible to use both read() and readline() on one text file in Python?
When I tried that, only the first reading function seemed to do anything.
file = open(name, "r")
inside = file.readline()
inside2 = file.read()
print(name)
print(inside)
print(inside2)
The result shows only the inside variable, not inside2.
Reading a file is like reading a book. When you call .read(), it reads through the book until the end. If you call .read() again, you have skipped a step: you can't read the book again until you flip the pages back to the beginning. Think of .readline() as reading one page: it gives you the contents of the page and then turns the page. A .read() after that starts from there and reads to the end, so the first page isn't included. If you want to start at the beginning, you need to turn the pages back, and the way to do that is the .seek() method. Its main argument is the position to seek to, and .seek(0) returns to the beginning of the file:
with open(name, 'r') as file:
    inside = file.readline()
    file.seek(0)
    inside2 = file.read()
There is also another way to read information from the file. It is used under the hood when you use a for loop:
with open(name) as file:
    for line in file:
        ...
That way is next(file), which gives you the next line. This way is a little special, though. In Python 2, if file.readline() or file.read() comes after next(file), you get an error saying that mixing iteration and read methods would lose data. Python 3 lets you mix them, although file.tell() is disabled on a text file while you are iterating over it with next(). (Credits to Sven Marnach for pointing this out.)
Yes, you can.
file.readline() reads one line from the file (the first line in this case), and then file.read() reads the rest of the file starting from the current position, which is where file.readline() left off.
You are probably receiving an empty string from file.read() because you reached EOF (end of file) immediately after reading the first line with file.readline(), which implies that your file contains only one line.
You can, however, return to the start of the file by moving the position back with file.seek(0).
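A short sketch of that behavior, using io.StringIO as a stand-in for a real text file (the three-line content is made up for the demonstration):

```python
import io

# io.StringIO stands in for a text file with three lines.
f = io.StringIO('first\nsecond\nthird\n')

inside = f.readline()  # reads only the first line
inside2 = f.read()     # reads everything after the first line
leftover = f.read()    # empty string, because the position is now at the end

f.seek(0)              # rewind to the start
everything = f.read()  # the whole content again

print(repr(inside))
print(repr(inside2))
print(repr(leftover))
print(repr(everything))
```

With a one-line file, inside2 would be an empty string, which matches the result described in the question.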

Can you read first line from file with open(fname, 'a+')?

I want to be able to open a file, append some text to the end, and then read only the first line. I know exactly how long the first line of the file is, and the file is large enough that I don't want to read it into memory all at once. I've tried using:
with open('./output files/log.txt', 'a+') as f:
    f.write('This is example text')
    content = f.readline()
    print(content)
but the print statement is blank. When I use open('./output files/log.txt') or open('./output files/log.txt', 'r+') instead of open('./output files/log.txt', 'a+'), this works, so I know it has to do with the 'a+' argument. My problem is that I have to append to the file. How can I append to the file and still get the first line without using something like
with open('./output files/log.txt', 'a+') as f_1:
    f_1.write('This is example text')
with open('./output files/log.txt') as f_2:
    content = f_2.readline()
    print(content)
When you open a file with the append flag a, it moves the file descriptor's pointer to the end of the file, so that the write call will add to the end of the file.
The readline() function reads from the current pointer of the file until the next '\n' character it reads. So when you open a file with append, and then call readline, it will try to read a line starting from the end of the file. This is why your print call is coming up blank.
You can see this in action by looking at where the file object is currently pointing, using the tell() function.
To read the first line, you have to make sure the file's pointer is back at the beginning of the file, which you can do with the seek function. seek takes two arguments: offset and from_what (named whence in the Python reference documentation). If you omit the second argument, the offset is measured from the beginning of the file. So to jump to the beginning of the file, call seek(0).
If you want to jump back to the end of the file, pass the from_what argument. A from_what value of 2 means the offset is measured from the end of the file. So to jump to the end, call seek(0, 2).
Demonstration of file pointers when opened in append mode:
Example using a text file that looks like this:
the first line of the file
and the last line
Code:
with open('example.txt', 'a+') as fd:
    print(fd.tell())  # at end of file
    fd.write('example line\n')
    print(fd.tell())  # at new end of the file after writing
    # jump to the beginning of the file:
    fd.seek(0)
    print(fd.readline())
    # jump back to the end of the file
    fd.seek(0, 2)
    fd.write('went back to the end')
console output:
45
57
the first line of the file
new contents of example.txt:
the first line of the file
and the last line
example line
went back to the end
You need to go back to the start of the file using seek(0), like so:
with open('./output files/log.txt', 'a+') as f_1:
    f_1.write('This is example text')
    f_1.seek(0)
    print(f_1.readline())
