I am creating a file editing system and would like to make a line based tell() function instead of a byte based one. This function would be used inside of a "with loop" with the open(file) call. This function is part of a class that has:
self.f = open(self.file, 'a+')
# self.file is a string that has the filename in it
The following is the original function
(It also has a char setting if you wanted line and byte return):
def tell(self, char=False):
    t, lc = self.f.tell(), 0
    self.f.seek(0)
    for line in self.f:
        if t >= len(line):
            t -= len(line)
            lc += 1
        else:
            break
    if char:
        return lc, t
    return lc
The problem I'm having is that this raises an OSError. It seems to be related to how the file is being iterated over, but I don't understand the issue. Thanks to anyone who can help.
I don't know if this was the original error but you can get the same error if you try to call f.tell() inside of a line-by-line iteration of a file like so:
with open(path, "r+") as f:
    for line in f:
        f.tell() #OSError
which can be easily substituted by the following:
with open(path, mode) as f:
    line = f.readline()
    while line:
        f.tell() #returns the location of the next line
        line = f.readline()
I have an older version of Python 3, and I'm on Linux instead of a Mac, but I was able to recreate something very close to your error:
IOError: telling position disabled by next() call
An IO error, not an OS error, but otherwise the same. Bizarrely enough, I couldn't cause it using your open('a+', ...), but only when opening the file in read mode: open('r+', ...).
Further muddling things is that the error comes from _io.TextIOWrapper, a class that appears to be defined in Python's _pyio.py file... I stress "appears", because:
The TextIOWrapper in that file has attributes like _telling that I can't access on the whatever-it-is object calling itself _io.TextIOWrapper.
The TextIOWrapper class in _pyio.py doesn't make any distinction between readable, writable, or random-access files. Either both should work, or both should raise the same IOError.
Regardless, the TextIOWrapper class as described in the _pyio.py file disables the tell method while the iteration is in progress. This seems to be what you're running into (comments are mine):
def __next__(self):
    # Disable the tell method.
    self._telling = False
    line = self.readline()
    if not line:
        # We've reached the end of the file...
        self._snapshot = None
        # ...so restore _telling to whatever it was.
        self._telling = self._seekable
        raise StopIteration
    return line
In your tell method, you almost always break out of the iteration before it reaches the end of the file, leaving _telling disabled (False).
One other way to reset _telling is the flush method, but it also failed if called while the iteration was in progress:
IOError: can't reconstruct logical file position
The way around this, at least on my system, is to call seek(0) on the TextIOWrapper, which restores everything to a known state (and successfully calls flush in the bargain):
def tell(self, char=False):
    t, lc = self.f.tell(), 0
    self.f.seek(0)
    for line in self.f:
        if t >= len(line):
            t -= len(line)
            lc += 1
        else:
            break
    # Reset the file iterator, or later calls to f.tell will
    # raise an IOError or OSError:
    self.f.seek(0)
    if char:
        return lc, t
    return lc
If that's not the solution for your system, it might at least tell you where to start looking.
PS: You should consider always returning both the line number and the character offset. Functions that can return completely different types are hard to deal with; it's a lot easier for the caller to just throw away the value he or she doesn't need.
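A minimal sketch of that suggestion, not the poster's code: the hypothetical helper line_and_column below always returns a (line, offset) tuple. It assumes a file object opened in binary mode, where tell() returns a plain byte offset, and it reads with readline() instead of the file iterator so tell() is never disabled. Unlike the function above, it seeks back to the original position rather than to 0.

```python
def line_and_column(f):
    """Return (line_index, byte_offset_within_line) for a binary file object.

    Hypothetical helper: assumes f is seekable and opened in binary mode.
    The file position is restored before returning.
    """
    target = f.tell()          # byte offset we want to translate
    f.seek(0)
    line_no, remaining = 0, target
    # Two-argument iter() calls f.readline until it returns b"" (EOF).
    for line in iter(f.readline, b""):
        if remaining >= len(line):
            remaining -= len(line)
            line_no += 1
        else:
            break
    f.seek(target)             # put the caller back where they were
    return line_no, remaining
```

A caller that only wants the line number can write line_no, _ = line_and_column(f).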
Just a quick workaround for this issue:
As you are iterating over the file from the beginning anyways, just keep track of where you are with a dedicated variable:
file_pos = 0
with open('file.txt', 'rb') as f:
    for line in f:
        # process line
        file_pos += len(line)
Now file_pos will always be what f.tell() would tell you. Note that tell and seek work with byte positions, so this relies on the file being opened in binary mode ('rb'); counting the characters of decoded text would only match for single-byte (ASCII) content. Working line by line, it is easy to decode each bytes line to a Unicode string when needed.
I had the same error: OSError: telling position disabled by next() call, and solved it by adding the 'rb' mode while opening the file.
The error message is pretty clear, but missing one detail: calling next on a text file object disables the tell method. A for loop repeatedly calls next on iter(f), which happens to be f itself for a file. I ran into a similar issue trying to call tell inside the loop instead of calling your function twice.
An alternative solution is to iterate over the file without using the built-in file iterator. Instead, you can bake a nearly equally efficient iterator from the arcane two-arg form of the iter function:
for line in iter(f.readline, ''):
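As a rough illustration of that loop, the sketch below records the tell() value at the start of every line. The function name line_start_offsets and the path argument are made up for this example. In text mode the values returned by tell() are opaque, so they should only be passed back to seek(), not used for arithmetic.

```python
def line_start_offsets(path):
    """Collect the tell() value at the start of each line of a text file."""
    offsets = []
    with open(path, "r") as f:
        pos = f.tell()
        # readline() does not disable tell(), unlike "for line in f".
        for line in iter(f.readline, ""):
            offsets.append(pos)
            pos = f.tell()   # position of the next line
    return offsets
```

Seeking to offsets[k] and calling readline() then returns line k again.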
Trying to do a college exercise where I'm supposed to replace a given line in a file, by the same line but written in all caps. The problem is we can only write in the same file, and in that exact line, we can't write in the rest of the file.
This is the code I have so far, but I can't figure out how to go to the line I want
def upper(n):
    count=0
    with open("upper.txt", "r+") as file:
        lines = file.readlines()
        file.seek(0)
        for line in file.readlines():
            if count == n:
                pos = file.tell()
                line1 = str(line.upper())
            count += 1
        file.seek(pos)
        file.write(line1)
Help appreciated!
The problem lies in that your readlines already has read the entire file, and so the position of the "file cursor" is always at the end of the file. In theory, a simple fix should be:
Initialize pos to 0.
Read a single line.
If the current line counter indicates this is the one you want, set the position to pos again, update that line, and exit.
Update pos to point to the end of this line (so it points to the start of the next line).
Loop until satisfied.
In code, that would be this:
def upper(n):
    count=0
    with open("text.txt", "r+") as file:
        pos = 0
        for line in file.readlines():
            if count == n:
                line1 = line.upper()
                break
            pos = file.tell()
            count += 1
        file.seek(pos)
        file.write(line1)
upper(5)
However! There is a snag. File operations are heavily buffered, and the for loop on readlines does not read one line at a time. Instead, for efficiency, it reads as much as possible, but it only "returns" the next line to your program. On a next run through your loop, it simply checks if it already had read enough of your text file to return the following line, and if not, it fills its internal buffer again. So, even while tell() will correctly be updated to the external file position – the value you see –, it does not reflect the "cursor" position of what you are processing at the time.
One way to circumvent this is to physically mimic what readlines does: read a single byte at a time, determine whether you have read an entire line (then this byte would be \n), and update your position and status based on this.
However, a more proper way of updating a file is to read it into memory in its entirety, change it, and write it back to disk. Changing part of an existing file with "r+" is usually recommended to use binary mode (where the position of each byte is known beforehand); admittedly, in theory your method should have worked as well, but as you see the file buffering defeats this.
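For the in-place variant in binary mode, a minimal sketch could look like the following. The name upper_in_place is hypothetical, and the approach assumes the replacement has exactly the same byte length as the original line. That holds here because bytes.upper() only changes ASCII letters.

```python
def upper_in_place(path, n):
    """Uppercase line n (0-based) of the file at path, leaving other bytes alone.

    Returns True if the line existed and was rewritten, False otherwise.
    """
    with open(path, "rb+") as f:
        pos = 0  # byte offset where the current line starts
        # readline() keeps tell() accurate, one line at a time.
        for index, line in enumerate(iter(f.readline, b"")):
            if index == n:
                f.seek(pos)
                # Same length as the original line, so nothing else shifts.
                f.write(line.upper())
                return True
            pos = f.tell()
    return False
```

If the replacement could be longer or shorter than the original line, this technique does not apply and the whole file has to be rewritten.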
Reading, changing, and writing the file entirely is as simple as this:
def better_upper(n):
    count=0
    with open("text.txt", "r") as file:
        lines = file.readlines()
    lines[n] = lines[n].upper()
    with open("text.txt", "w") as file:
        file.writelines(lines)
better_upper(5)
(Where the only caveat is that it always overwrites the original file. That is: if something unexpected goes wrong, it will probably erase text.txt. If you want a belt-and-suspenders approach, write to a new file, then check if it got written correctly. If it did, delete the old file and rename the new one. Left as an exercise to the reader.)
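One possible shape for that belt-and-suspenders version, offered as a sketch rather than the definitive answer: write the new contents to a temporary file in the same directory, then swap it in with os.replace. The function name safer_upper and its path parameter are assumptions made for this example.

```python
import os
import tempfile

def safer_upper(path, n):
    """Uppercase line n of path by writing a temp file and renaming it over the original."""
    with open(path, "r") as src:
        lines = src.readlines()
    lines[n] = lines[n].upper()

    # Create the temp file next to the original so the rename stays on one filesystem.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w") as dst:
            dst.writelines(lines)
        # Only replace the original once the new file is completely written.
        os.replace(tmp_path, path)
    except BaseException:
        # Something went wrong: the original is untouched, drop the temp file.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

One caveat of this sketch: the file created by mkstemp has its own permissions, so the replaced file may not keep the permissions of the original.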
I am reading each line of a file and performing some operations on it. Sometimes the program throws an error due to some strange behavior in the network(It does SSH to a remote machine). This occurs once in a while. I want to catch this error and perform the same operations again on the same line. To be specific, I want to read the same line again. I am looking for something like this.
with open(file_name) as f:
    for line in f:
        try:
            do this
        except IndexError:
            go back and read the same line again from the file.
As long as you’re within the block of your for loop, you still have access to that line (unless you modified it knowingly of course). So you don’t actually need to reread it from the file but you just still have it in memory.
You could for example try to “do this” repeatedly until it succeeds like this:
for line in f:
    while True:
        try:
            print(line)
            doThis()
        except IndexError:
            # we got an error, so let’s rerun this inner while loop
            pass
        else:
            # if we don’t get an error, abort the inner while loop
            # to get to the next line
            break
You don't need to re-read the line. The line variable is holding your line. What you want to do is retry your operation in case it fails. One way would be to use a function and call the function from the function whenever it fails.
def do(line):
    try:
        pass # your "do this" code here
    except IndexError:
        do(line)
with open(file_name) as f:
    for line in f:
        do(line)
Python does not have a 'repeat' keyword that resets the execution pointer to the beginning of the current iteration. Your best approach is probably to look again at the structure of your code and break down 'do this' into a function that retries until it completes.
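A retrying function along those lines might look like the sketch below. The name run_with_retries and the parameters max_attempts and retry_on are hypothetical. Capping the number of attempts is an added assumption, so that a permanently failing line cannot loop or recurse forever.

```python
def run_with_retries(operation, line, max_attempts=3, retry_on=(IndexError,)):
    """Call operation(line), retrying on the given exceptions up to max_attempts times."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation(line)
        except retry_on:
            # Give up and re-raise once the last attempt has failed.
            if attempt == max_attempts:
                raise
```

Inside the file loop this would be called as run_with_retries(do_this, line), where do_this stands for whatever the "do this" step is.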
But if you are really set on emulating a repeat keyword as closely as possible, you can implement this by wrapping the file object in a generator.
Rather than looping directly over the file, define a generator that yields from the file one line at a time, with a repeat option.
def repeating_generator(iterator_in):
    for x in iterator_in:
        repeat = True
        while repeat:
            repeat = yield x
            yield
Your file object can be wrapped with this generator. We pass a flag back into the generator telling it whether to repeat the previous line, or continue to the next one...
with open(file_name) as f:
    r = repeating_generator(f)
    for line in r:
        try:
            #do this
            r.send(False) # Don't repeat
        except IndexError:
            r.send(True) #go back and read the same line again from the file.
Take a look at this question to see what's happening here. I don't think this is the most readable way of doing this; consider the alternatives first! Note that you will need Python 2.7 or later to be able to use this.
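To see the send() protocol in action without a file, here is a small self-contained sketch that runs the same generator over a list. The helper process_with_retries and its log of ("ok" / "retry", item) pairs are invented for this demonstration.

```python
def repeating_generator(iterator_in):
    for x in iterator_in:
        repeat = True
        while repeat:
            repeat = yield x   # value passed to send() decides whether to repeat
            yield              # absorbs the send() call so the for loop sees x next

def process_with_retries(items, operation):
    """Run operation on each item, repeating an item whenever it raises IndexError."""
    log = []
    r = repeating_generator(items)
    for item in r:
        try:
            operation(item)
        except IndexError:
            log.append(("retry", item))
            r.send(True)    # hand the same item out again
        else:
            log.append(("ok", item))
            r.send(False)   # move on to the next item
    return log
```

An operation that fails once on some item shows up in the log as a "retry" entry followed by an "ok" entry for the same item.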
I want to read huge text file line by line (and stop if a line with "str" found).
How to check, if file-end is reached?
fn = 't.log'
f = open(fn, 'r')
while not _is_eof(f): ## how to check that end is reached?
    s = f.readline()
    print s
    if "str" in s: break
There's no need to check for EOF in python, simply do:
with open('t.ini') as f:
    for line in f:
        # For Python3, use print(line)
        print line
        if 'str' in line:
            break
Why the with statement:
It is good practice to use the with keyword when dealing with file
objects. This has the advantage that the file is properly closed after
its suite finishes, even if an exception is raised on the way.
Just iterate over each line in the file. Python automatically checks for the End of file and closes the file for you (using the with syntax).
with open('fileName', 'r') as f:
    for line in f:
        if 'str' in line:
            break
There are situations where you can't use the (quite convincing) with... for... structure. In that case, do the following:
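while True:
    line = self.fo.readline()
    if len(line) == 0:
        break  # readline() returned '', so EOF was reached
    if 'str' in line:
        break
This works because readline() keeps the trailing newline character on the lines it returns, so even a blank line comes back as '\n', whereas EOF yields an empty string.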
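The same idea as a standalone function, as a sketch only: read_until is a hypothetical name, and it accepts any text file object with a readline() method.

```python
def read_until(f, needle):
    """Read lines from f until one contains needle or EOF is reached.

    Returns the lines read, including the matching line if there was one.
    """
    lines = []
    while True:
        line = f.readline()
        if line == "":       # empty string means EOF; a blank line would be "\n"
            break
        lines.append(line)
        if needle in line:
            break
    return lines
```

Because the blank line in the middle of a file is returned as "\n", it does not end the loop early.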
You can avoid the double-spaced output (each line already ends with a newline, and print adds another) by using
with open('t.ini') as f:
    for line in f:
        print line.strip()
        if 'str' in line:
            break
The simplest way to read a file one line at a time is this:
for line in open('fileName'):
    if 'str' in line:
        break
No need for a with-statement or explicit close. Notice no variable 'f' that refers to the file. In this case python assigns the result of the open() to a hidden, temporary variable. When the for loop ends (no matter how -- end-of-file, break or exception), the temporary variable goes out of scope and is deleted; its destructor will then close the file.
This works as long as you don't need to explicitly access the file in the loop, i.e., no need for seek, flush, or similar. Should also note that this relies on python using a reference counting garbage collector, which deletes an object as soon as its reference count goes to zero.
If I've done the following:
import codecs
lines = codecs.open(somefile, 'r','utf8').readlines()
Is there a way to close a file that I never assigned to a variable? If so, how? Normally, I could have done:
import codecs
reader = codecs.open(somefile, 'r','utf8')
lines = reader.readlines()
reader.close()
In CPython, the file object will close on its own once the reference count drops to 0, which is right after .readlines() returns. For other Python implementations it may take a little longer depending on the garbage collection algorithm used. The file is certainly going to be closed no later than program exit.
You should really use the file object as a context manager and have the with statement call close on it:
with codecs.open(somefile, 'r','utf8') as reader:
    lines = reader.readlines()
As soon as the block of code indented under the with statement exits (be it with an exception, a return, continue or break statement, or simply because all code in the block finished executing), the reader file object will be closed.
Bonus tip: file objects are iterables, so the following also works:
with codecs.open(somefile, 'r','utf8') as reader:
    lines = list(reader)
for the exact same result.
I would like to read a file again and again, starting over whenever it reaches the end.
The file contains only numbers separated by commas.
I use Python, and I read in the docs that file.seek(0) can be used for this, but it doesn't work for me.
This is my script:
self.users = []
self.index = -1
infile = open(filename, "r")
for line in infile.readlines():
    if line != None:
        self.users.append(String.split((line),','))
    else:
        infile.seek(0)
        infile.read()
infile.close()
self.index= self._index +1
return self.users[self.index]
Thank you for your help
infile.read() will read in the whole of the file and then throw away the result. Why are you doing it?
When you call infile.readlines you have already read in the whole file. Then your loop iterates over the result, which is just a Python list. Moving to the start of the file will have no effect on that.
If your code did in fact move to the start of the file after reaching the end, it would simply loop for ever until it ran out of memory (because of the endlessly growing users list).
You could get the behaviour you're asking for by storing the result of readlines() in a variable and then putting the whole for line in all_lines: loop inside another while True:. (Or closing, re-opening and re-reading every time, if (a) you are worried that the file might be changed by another program or (b) you want to avoid reading it all in in a single gulp. For (b) you would replace for line in infile.readlines(): with for line in infile:. For (a), note that trying to read a file while something else might be writing to it is likely to be a bad idea no matter how you do it.)
I strongly suspect that the behaviour you're asking for is not what you really want. What's the goal you're trying to achieve by making your program keep reading the file over and over?
The 'else' branch will never be pursued because the for loop will iterate over all the lines of the files and then exit.
If you want the seek operation to be executed you will have to put it outside the for loop
self.users = []
self.index = -1
infile = open(filename, "r")
while True:
    for line in infile.readlines():
        self.users.append(line.split(','))
    # rewind so the next pass of the while loop re-reads the file;
    # add an exit condition (break) here, or this loop never ends
    infile.seek(0)
infile.close()
self.index = self.index + 1
return self.users[self.index]
The problem is that if you loop forever, you will exhaust memory. If you want to read the file only twice, copy and paste the for loop; otherwise decide on an exit condition and use a break statement.
readlines is already reading the entire file contents into an in-memory list, which you are free to iterate over again and again!
To re-read the file do:
infile = open('whatever')
while True:
    content = infile.readlines()
    # do something with list 'content'
    # re-read the file - why? I do not know
    infile.seek(0)
infile.close()
You can use itertools.cycle() here.
Here's an example :
import itertools
f = open(filename)
lines = f.readlines()
f.close()
for line in itertools.cycle(lines):
    print line,
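Applied to the original question (hand out one comma-separated record per call and wrap around at the end), a Python 3 sketch could look like this. The class name UserCycler and its next_user method are hypothetical. The lines are read once up front, so changes made to the file afterwards are not picked up.

```python
import itertools

class UserCycler:
    """Hand out parsed records one at a time, starting over after the last one."""

    def __init__(self, lines):
        # Parse each non-empty line once; cycle() then repeats the list endlessly.
        rows = [line.strip().split(",") for line in lines if line.strip()]
        self._rows = itertools.cycle(rows)

    def next_user(self):
        # Raises StopIteration if there were no rows at all.
        return next(self._rows)
```

With a real file, the lines argument would come from something like `with open(filename) as f: cycler = UserCycler(f.readlines())`.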