I would like to read a file again from the start each time it reaches the end.
The file contains only numbers separated by commas.
I use Python and I read in the docs that file.seek(0) can be used for this, but it doesn't work for me.
This is my script:
self.users = []
self.index = -1
infile = open(filename, "r")
for line in infile.readlines():
    if line != None:
        self.users.append(String.split((line),','))
    else:
        infile.seek(0)
        infile.read()
infile.close()
self.index= self._index +1
return self.users[self.index]
Thank you for your help
infile.read() will read in the whole of the file and then throw away the result. Why are you doing it?
When you call infile.readlines you have already read in the whole file. Then your loop iterates over the result, which is just a Python list. Moving to the start of the file will have no effect on that.
If your code did in fact move to the start of the file after reaching the end, it would simply loop for ever until it ran out of memory (because of the endlessly growing users list).
You could get the behaviour you're asking for by storing the result of readlines() in a variable and then putting the whole for line in all_lines: loop inside another while True:. (Or closing, re-opening and re-reading every time, if (a) you are worried that the file might be changed by another program or (b) you want to avoid reading it all in in a single gulp. For (b) you would replace for line in infile.readlines(): with for line in infile:. For (a), note that trying to read a file while something else might be writing to it is likely to be a bad idea no matter how you do it.)
I strongly suspect that the behaviour you're asking for is not what you really want. What's the goal you're trying to achieve by making your program keep reading the file over and over?
The 'else' branch will never be taken, because the for loop iterates over all the lines of the file and then exits.
If you want the seek operation to be executed, you will have to put it outside the for loop:
self.users = []
self.index = -1
infile = open(filename, "r")
while True:
    for line in infile.readlines():
        self.users.append(line.split(','))
    infile.seek(0)  # rewind so the next readlines() starts from the top
infile.close()  # never reached: the while loop has no exit condition yet
self.index = self.index + 1
return self.users[self.index]
The problem is that if you loop forever, you will exhaust the memory. If you want to read the file only twice, copy and paste the for loop; otherwise decide on an exit condition and use a break statement.
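As a rough illustration of the "exit condition" idea, here is a minimal sketch in Python 3 that re-reads the file a fixed number of times. The function name read_passes and the passes parameter are made up for this example; they are not part of the original script.

```python
def read_passes(path, passes):
    """Return the comma-split rows of the file, read `passes` times over."""
    rows = []
    with open(path) as infile:
        for _ in range(passes):      # exit condition: a fixed number of passes
            for line in infile:      # reads line by line until end of file
                rows.append(line.strip().split(","))
            infile.seek(0)           # rewind before the next pass
    return rows
```

The list still grows with every pass, so this only makes sense when the number of passes is small and known in advance.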
readlines is already reading the entire file contents into an in-memory list, which you are free to iterate over again and again!
To re-read the file do:
infile = open('whatever')
while True:
    content = infile.readlines()
    # do something with list 'content'
    # re-read the file - why? I do not know
    infile.seek(0)
infile.close()
You can use itertools.cycle() here.
Here's an example :
import itertools
f = open(filename)
lines = f.readlines()
f.close()
for line in itertools.cycle(lines):
    print line,
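Applied to the comma-separated file from the question, this might look like the following sketch (Python 3). The helper name cycle_rows is hypothetical, and skipping blank lines is an assumption about the input, not something the question states.

```python
import itertools

def cycle_rows(path):
    """Read the file once, then hand back an endless iterator over its rows."""
    with open(path) as f:
        # Split each non-blank line on commas; the file is closed afterwards.
        rows = [line.strip().split(",") for line in f if line.strip()]
    return itertools.cycle(rows)
```

Calling next() on the returned iterator gives the next row and wraps around to the first row after the last one, without touching the file again and without the list growing.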
I'm trying to do a college exercise where I'm supposed to replace a given line in a file with the same line written in all caps. The catch is that we can only write to the same file, and only to that exact line; we can't rewrite the rest of the file.
This is the code I have so far, but I can't figure out how to go to the line I want:
def upper(n):
    count=0
    with open("upper.txt", "r+") as file:
        lines = file.readlines()
        file.seek(0)
        for line in file.readlines():
            if count == n:
                pos = file.tell()
                line1 = str(line.upper())
            count += 1
        file.seek(pos)
        file.write(line1)
Help appreciated!
The problem lies in that your readlines already has read the entire file, and so the position of the "file cursor" is always at the end of the file. In theory, a simple fix should be:
Initialize pos to 0.
Read a single line.
If the current line counter indicates this is the one you want, set the position to pos again, update that line, and exit.
Update pos to point to the end of this line (so it points to the start of the next line).
Loop until satisfied.
In code, that would be this:
def upper(n):
    count=0
    with open("text.txt", "r+") as file:
        pos = 0
        for line in file.readlines():
            if count == n:
                line1 = line.upper()
                break
            pos = file.tell()
            count += 1
        file.seek(pos)
        file.write(line1)

upper(5)
However, there is a snag. File operations are heavily buffered, and the for loop on readlines does not read one line at a time. Instead, for efficiency, it reads as much as possible but only "returns" the next line to your program. On the next pass through your loop, it simply checks whether it has already read enough of your text file to return the following line, and if not, it fills its internal buffer again. So even though tell() is correctly updated to the external file position (the value you see), it does not reflect the "cursor" position of the line you are processing at the time.
One way to circumvent this is to physically mimic what readlines does: read a single byte at a time, determine whether you have read an entire line (then this byte would be \n), and update your position and status based on this.
However, a more proper way of updating a file is to read it into memory in its entirety, change it, and write it back to disk. When changing part of an existing file with "r+", binary mode is usually recommended (there the position of each byte is known beforehand); admittedly, in theory your method should have worked as well, but as you see, the file buffering defeats this.
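If the exercise really does require writing only that one line, a minimal sketch of the binary-mode route could look like this (Python 3). It uses readline() instead of reading byte by byte, and it assumes the text is ASCII-compatible so that uppercasing does not change the length of the line; the name upper_in_place is made up for this example.

```python
def upper_in_place(path, n):
    """Uppercase line n (0-based) of the file, overwriting only that line."""
    with open(path, "r+b") as f:
        # Skip the first n lines; in binary mode tell() tracks readline() exactly.
        for _ in range(n):
            if not f.readline():
                raise IndexError("line %d is past the end of the file" % n)
        pos = f.tell()               # byte offset where line n starts
        line = f.readline()
        if not line:
            raise IndexError("line %d is past the end of the file" % n)
        f.seek(pos)                  # go back to the start of line n
        f.write(line.upper())        # bytes.upper() only touches ASCII letters
```

Because bytes.upper() leaves the byte count unchanged, the write covers exactly the bytes of the original line and nothing after it.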
Reading, changing, and writing the file entirely is as simple as this:
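def better_upper(n):
    # Read every line into memory, then close the file
    with open("text.txt", "r") as file:
        lines = file.readlines()
    # Change only line n (0-based)
    lines[n] = lines[n].upper()
    # Write everything back, replacing the old contents
    with open("text.txt", "w") as file:
        file.writelines(lines)

better_upper(5)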
(Where the only caveat is that it always overwrites the original file. That is: if something unexpected goes wrong, it will probably erase text.txt. If you want a belt-and-suspenders approach, write to a new file, then check if it got written correctly. If it did, delete the old file and rename the new one. Left as an exercise to the reader.)
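For what it's worth, one possible shape of that exercise is sketched below (Python 3). It writes to a temporary file in the same directory and then swaps it in with os.replace, which combines the "delete the old file and rename the new one" steps into a single call. The name safe_upper is hypothetical, and this is only one way to do it.

```python
import os
import tempfile

def safe_upper(path, n):
    """Uppercase line n (0-based), replacing the file only after a full write."""
    with open(path, "r") as src:
        lines = src.readlines()
    lines[n] = lines[n].upper()

    # Create the temporary file next to the original so the rename stays
    # on the same filesystem.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w") as tmp:
            tmp.writelines(lines)
        os.replace(tmp_path, path)   # swap the new file in for the old one
    except BaseException:
        # Something went wrong: leave the original alone, drop the temp file.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

If anything fails before os.replace runs, the original file is left untouched.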
Is it possible to use both read() and readline() on one text file in Python?
When I tried it, only the first reading function seemed to do anything.
file = open(name, "r")
inside = file.readline()
inside2 = file.read()
print(name)
print(inside)
print(inside2)
The result shows only the inside variable, not inside2.
Reading a file is like reading a book. When you say .read(), it reads through the book until the end. If you say .read() again, well you forgot one step. You can't read it again unless you flip back the pages until you're at the beginning. If you say .readline(), we can call that a page. It tells you the contents of the page and then turns the page. Now, saying .read() starts there and reads to the end. That first page isn't included. If you want to start at the beginning, you need to turn back the page. The way to do that is with the .seek() method. It is given a single argument: a character position to seek to:
with open(name, 'r') as file:
    inside = file.readline()
    file.seek(0)
    inside2 = file.read()
There is also another way to read information from the file. It is used under the hood when you use a for loop:
with open(name) as file:
    for line in file:
        ...
That way is next(file), which gives you the next line. This way is a little special, though. In Python 2, if file.readline() or file.read() comes after next(file), you will get an error saying that mixing iteration and read methods would lose data. (Credits to Sven Marnach for pointing this out.)
Yes you can.
file.readline() reads a line from the file (the first line in this case), and then file.read() reads the rest of the file starting from the seek position, in this case, where file.readline() left off.
You are probably receiving an empty string from file.read() because you reached EOF (end of file) immediately after reading the first line with file.readline(), which implies your file contains only one line.
You can, however, return to the start of the file by moving the seek position back with file.seek(0).
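To see all three behaviours side by side, here is a small sketch (Python 3); the function name is made up for illustration.

```python
def first_line_rest_and_whole(path):
    """Show what readline(), read() and read() after seek(0) each return."""
    with open(path) as f:
        first = f.readline()   # the first line only
        rest = f.read()        # everything after the first line ("" if none)
        f.seek(0)              # move back to the start of the file
        whole = f.read()       # the entire file, first line included
    return first, rest, whole
```

For a file that holds a single line, rest comes back as an empty string, which matches the symptom described in the question.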
Let's say I have a file source.txt containing a few rows.
I want to print rows over and over until I break the program manually.
file_source = 'source.txt'
source = open(file_source,'r')
while 1:
    for line in source:
        print line
source.close()
The easiest solution is to put the open and close inside the while loop, but my feeling is that that's not the best solution.
Can you suggest something better?
How to loop over variable source many times?
Regards
I wasn't sure this would work, but it appears you can just seek to the beginning of the file and then continue iterating:
file_source = 'source.txt'
source = open(file_source,'r')
while 1:
    for line in source:
        print line
    source.seek(0)
source.close()
And obviously if the file is small you could simply read the whole thing into a list in memory and iterate over that instead.
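The seek approach can also be wrapped in a generator so the caller decides when to stop. This is only a sketch in Python 3; cycle_file is a made-up name, and the guard against an empty file is an addition of mine to avoid spinning forever without yielding anything.

```python
def cycle_file(path):
    """Yield the lines of the file over and over, rewinding at end of file."""
    with open(path) as source:
        while True:
            got_line = False
            for line in source:
                got_line = True
                yield line
            if not got_line:
                return           # empty file: stop instead of looping forever
            source.seek(0)       # back to the beginning for the next pass
```

Unlike reading everything into a list, this keeps only one line in memory at a time, at the cost of re-reading the file from disk on every pass.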
You can read the lines at first and save them into a list. So your file is closed after reading. Then you can proceed with your infinite loop:
lines = []
with open(file_source, 'rb') as f:
    lines = f.readlines()
while 1:
    for line in lines:
        print line
But, this is not advised if your file is very large since everything from the file will be read into the memory:
file.readlines([sizehint]):
Read until EOF using readline() and return a list containing the lines thus read.
I'm trying to write a Python script that uses a particular external application belonging to the company I work for. I can generally figure things out for myself when it comes to programming and scripting, but this time I am truly lost!
I can't seem to figure out why the while loop won't function as it is meant to. It doesn't give any errors, which doesn't help me. It just seems to skip past the important part of the code in the centre of the loop and then goes on to increment the "count" like it should afterwards!
f = open('C:/tmp/tmp1.txt', 'w') #Create a temporary textfile
f.write("TEXTFILE\nTEXTFILE\nTEXTFILE\nTEXTFILE\nTEXTFILE\nTEXTFILE\n") #Put some simple text in there
f.close() #Close the file
count = 0 #Insert the line number from the text file you want to begin with (first line starts with 0)
num_lines = sum(1 for line1 in open('C:/tmp/tmp1.txt')) #Get the number of lines from the textfile
f = open('C:/tmp/tmp2.txt', 'w') #Create a new textfile
f.close() #Close it
while (count < num_lines): #Keep the loop within the starting line and total number of lines from the first text file
    with open('C:/tmp/tmp1.txt', 'r') as f: #Open the first textfile
        line2 = f.readlines() #Read these lines for later input
        for line2[count] in f: #For each line from chosen starting line until last line from first text file,...
            with open('C:/tmp/tmp2.txt', 'a') as g: #...with the second textfile open for appending strings,...
                g.write("hello\n") #...write 'hello\n' each time while "count" < "num_lines"
    count = count + 1 #Increment the "count"
I think everything works up until: "for line2[count] in f:"
The real code I'm working on is somewhat more complicated, and the application I'm using isn't exactly for sharing, so I have simplified the code to give silly outputs instead just to fix the problem.
I'm not looking for alternative code, I'm just looking for a reason why the loop isn't working so I can try to fix it myself.
All answers will be appreciated, and thanking everyone in advance!
Cormac
Some comments:
num_lines = sum(1 for line1 in open('C:/tmp/tmp1.txt'))
Why? What's wrong with len(open(filename, 'rb').readlines())?
while (count < num_lines):
    ...
    count = count + 1
This is bad style, you could use:
for i in range(num_lines):
    ...
Note that I named your index i, which is universally recognized, and that I used range and a for loop.
Now, your problem, like I said in the comment, is that f is a file (that is, a stream of bytes with a location pointer) and you've read all the lines from it. So when you do for line2[count] in f:, it will try reading a line into line2[count] (this is a bit weird, actually, you almost never use a for loop with a list member as an index but apparently you can do that), see that there's no line to read, and never executes what's inside the loop.
Anyway, you want to read a file, line by line, starting from a given line number? Here's a better way to do that:
from itertools import islice
start_line = 0 # change this
filename = "foobar" # also this
with open(filename, 'rb') as f:
    for line in islice(f, start_line, None):
        print(line)
I realize you don't want alternative code, but your code really is needlessly complicated.
If you want to iterate over the lines in the file f, I suggest replacing your "for" line with
for line in line2:
    # do something with "line"...
You put the lines in an array called line2, so use that array! Using line2[count] as a loop variable doesn't make sense to me.
You seem to misunderstand how the 'for line in f' loop works. It iterates over a file and calls readline until there are no lines left to read. But at the moment you start the loop, all the lines have already been read (via f.readlines()) and the file's current position is at the end. You can achieve what you want by calling f.seek(0), but that doesn't seem to be a good decision anyway, since you would read the file again, and that is slow IO.
Instead you want to do something like:
for line in line2[count:]: # iterate over lines read, starting with `count` line
    do_smth_with(line)
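A quick way to convince yourself of this is to count how many times a loop over the file runs before and after rewinding. The sketch below is Python 3 and the function name is invented for the demonstration.

```python
def count_loop_runs(path):
    """Count loop iterations over a file right after readlines() and after seek(0)."""
    with open(path) as f:
        lines = f.readlines()                     # consumes the whole file
        runs_after_readlines = sum(1 for _ in f)  # nothing left: the body never runs
        f.seek(0)                                 # rewind to the beginning
        runs_after_seek = sum(1 for _ in f)       # now every line is seen again
    return len(lines), runs_after_readlines, runs_after_seek
```

The middle number is always zero, which is exactly why the inner loop in the question appears to be skipped.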
Very new to python and can't understand why this isn't working. I have a list of web addresses stored line by line in a text file. I want to store the first 10 in an array/list called bing, the next 10 in a list called yahoo, and the last 10 in a list called duckgo. I'm using the readlines function to read the data from the file into each array. The problem is nothing is being written to the lists. The count is incrementing like it should. Also, if I remove the loops altogether and just read the whole text file into one list it works perfectly. This leads me to believe that the loops are causing the problem. The code I am using is below. Would really appreciate some feedback.
count=0;
#Open the file
fo=open("results.txt","r")
#read into each array
while(count<30):
    if(count<10):
        bing = fo.readlines()
        count+=1
        print bing
        print count
    elif(count>=10 and count<=19):
        yahoo = fo.readlines()
        count+=1
        print count
    elif(count>=20 and count<=29):
        duckgo = fo.readlines()
        count+=1
        print count
print bing
print yahoo
print duckgo
fo.close
You're using readlines to read the files. readlines reads all of the lines at once, so the very first time through your loop, you exhaust the entire file and store the result in bing. Then, every time through the loop, you overwrite bing, yahoo, or duckgo with the (empty) result of the next readlines call. So your lists all wind up being empty.
There are lots of ways to fix this. Among other things, you should consider reading the file a line at a time, with readline (no 's'). Or better yet, you could iterate over the file, line by line, simply by using a for loop:
for line in fo:
    ...
To keep the structure of your current code you could use enumerate:
for line_number, line in enumerate(fo):
    if condition(line_number):
        ...
But frankly I think you should ditch your current system. A much simpler way would be to use readlines without a loop, and slice the resulting list!
lines = fo.readlines()
bing = lines[0:10]
yahoo = lines[10:20]
duckgo = lines[20:30]
There are many other ways to do this, and some might be better, but none are simpler!
readlines() reads all of the lines of the file. If you call it again, you get an empty list. So you are overwriting your lists with empty data as you iterate through your loop.
You should be using readline() instead of readlines().
readlines() reads the entire file in at once, whereas readline() reads a single line from the file.
I suggest you rewrite it like so:
bing = []
yahoo = []
duckgo = []
with open("results.txt", "r") as f:
    for i, line in enumerate(f):
        if i < 10:
            bing.append(line)
        elif i < 20:
            yahoo.append(line)
        elif i < 30:
            duckgo.append(line)
        else:
            raise RuntimeError("too many lines in input file")
Note how we use enumerate() to get a running count of lines, rather than making our own count variable and needing to increment it ourselves. This is considered good style in Python.
But I think the best way to solve this problem would be to use itertools like so:
import itertools as it
with open("results.txt", "r") as f:
    bing = list(it.islice(f, 10))
    yahoo = list(it.islice(f, 10))
    duckgo = list(it.islice(f, 10))
    if list(it.islice(f, 1)):
        raise RuntimeError("too many lines in input file")
itertools.islice() (or it.islice() since I did the import itertools as it) will pull a specified number of items from an iterator. Our open file-handle object f is an iterator that returns lines from the file, so it.islice(f, 10) pulls exactly 10 lines from the input file.
Because it.islice() returns an iterator, we must explicitly expand it out to a list by wrapping it in list().
I think this is the simplest way to do it. It perfectly expresses what we want: for each one, we want a list with 10 lines from the file. There is no need to keep a counter at all, just pull the 10 lines each time!
EDIT: The check for extra lines now uses it.islice(f, 1) so that it will only pull a single line. Even one extra line is enough to know that there are more than just the 30 expected lines, and this way if someone accidentally runs this code on a very large file, it won't try to slurp the whole file into memory.
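If the number of lists or their size ever changes, the same idea generalises. Here is a sketch in Python 3; the function split_into_groups and its parameters are hypothetical names chosen for this example.

```python
from itertools import islice

def split_into_groups(path, names, size):
    """Return {name: list of `size` lines} for each name, in file order."""
    with open(path) as f:
        # Each islice call continues where the previous one stopped.
        groups = {name: list(islice(f, size)) for name in names}
        # Pull at most one more line to detect unexpected extra input.
        if next(f, None) is not None:
            raise RuntimeError("too many lines in input file")
    return groups
```

For the question's data this would be called as split_into_groups("results.txt", ["bing", "yahoo", "duckgo"], 10). If the file is shorter than expected, the later lists simply come back short or empty.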