seek() function? - python

Please excuse my confusion here, but I have read the documentation on the seek() function in Python (after having to use it), and although it helped, I am still a bit confused about what it actually does. Any explanations are much appreciated, thank you.

Regarding seek() there's not too much to worry about.
First of all, it is useful when operating over an open file.
It's important to note that its syntax is as follows:
fp.seek(offset, from_what)
where fp is the file pointer you're working with; offset means how many positions you will move; from_what defines your point of reference:
0: means your reference point is the beginning of the file
1: means your reference point is the current file position
2: means your reference point is the end of the file
if omitted, from_what defaults to 0.
Never forget that when you work with a file, there is always a current position inside it. When the file has just been opened, that position is the beginning of the file, but it advances as you read or write.
seek() is useful when you need to move around in that open file, like walking back and forth along a path.
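To make the three reference points concrete, here is a minimal sketch. It assumes an in-memory io.BytesIO object as a stand-in for a real file opened in binary mode, and uses the os.SEEK_CUR and os.SEEK_END constants, which are the named equivalents of 1 and 2.

```python
import io
import os

# An in-memory binary file stands in for a real file opened with 'rb'.
fp = io.BytesIO(b"0123456789")

fp.seek(3)                 # from_what defaults to 0: 3 bytes from the start
from_start = fp.read(1)    # reads the byte at index 3; position is now 4

fp.seek(2, os.SEEK_CUR)    # from_what 1: 2 bytes forward from the current position
from_current = fp.read(1)  # reads the byte at index 6

fp.seek(-1, os.SEEK_END)   # from_what 2: 1 byte before the end of the file
from_end = fp.read(1)      # reads the last byte

print(from_start, from_current, from_end)
```

Binary mode is used here because, in Python 3, text-mode files only accept seeks relative to the beginning (apart from seek(0, 2) to jump to the very end).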

When you open a file, the system points to the beginning of the file. Any read or write you do will happen from the beginning. A seek() operation moves that pointer to some other part of the file so you can read or write at that place.
So, if you want to read the whole file but skip the first 20 bytes, open the file, seek(20) to move to where you want to start reading, then continue with reading the file.
Or say you want to read every 10th byte: you could write a loop that does seek(9, 1) (moves 9 bytes forward relative to the current position), then read(1) (reads one byte), and repeats.
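Both ideas can be sketched roughly like this. The 100-byte io.BytesIO buffer is an assumption standing in for a real file opened with 'rb' (in Python 3 a relative seek such as seek(9, 1) needs binary mode).

```python
import io

data = bytes(range(100))          # 100 bytes with values 0..99
fp = io.BytesIO(data)             # stands in for open(path, 'rb')

fp.seek(20)                       # skip the first 20 bytes
rest = fp.read()                  # everything from byte 20 onward

fp.seek(0)                        # rewind to the beginning
every_tenth = []
while True:
    fp.seek(9, 1)                 # move 9 bytes forward from the current position
    chunk = fp.read(1)            # read one byte (the 10th of each group)
    if not chunk:                 # an empty result means end of file
        break
    every_tenth.append(chunk[0])  # store the byte's integer value

print(len(rest), every_tenth)
```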

The seek function expects an offset in bytes.
ASCII file example:
So if you have a text file with the following content:
simple.txt
abc
You can jump 1 byte to skip over the first character as follows:
fp = open('simple.txt', 'r')
fp.seek(1)
print(fp.readline())
# prints: bc
Binary file example, reading the width and height of a PNG:
import struct
fp = open('afile.png', 'rb')
fp.seek(16)
print('width: {0}'.format(struct.unpack('>i', fp.read(4))[0]))
print('height: {0}'.format(struct.unpack('>i', fp.read(4))[0]))
Note: calling read() also moves the file position, just as seek() does.

For files opened in text mode, mostly forget about whence: use f.seek(0) to position at the beginning of the file and f.seek(0, 2) to position at the end of the file (file objects have no len()). Use open(file, "r+") to read and write anywhere in a file. If you use "a+", you'll only be able to write (append) at the end of the file, regardless of where you position the cursor.
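Here is a small sketch of the "r+" case. The scratch file demo.txt in a temporary directory is hypothetical and created only for the demonstration; the content is plain ASCII so that positions and character counts coincide.

```python
import os
import tempfile

# Hypothetical scratch file, created only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")
with open(path, "w") as f:
    f.write("hello world")

with open(path, "r+") as f:
    end = f.seek(0, os.SEEK_END)   # jump to the end; seek() returns the new position
    f.seek(0)                      # back to the beginning
    f.write("J")                   # overwrite the first character in place
    f.seek(0)                      # rewind again before reading
    content = f.read()

print(end, content)
```

With "r+" the existing content is kept and the write lands wherever the cursor is, which is why only the first character changes.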

Related

Is there a buffer when one uses the 'readline()' method? Can i access previously 'read/accessed' lines of txt? [duplicate]

For an exercise I'm doing, I'm trying to read the contents of a given file twice using the read() method. Strangely, when I call it the second time, it doesn't seem to return the file content as a string?
Here's the code
f = f.open()
# get the year
match = re.search(r'Popularity in (\d+)', f.read())
if match:
    print(match.group(1))
# get all the names
matches = re.findall(r'<td>(\d+)</td><td>(\w+)</td><td>(\w+)</td>', f.read())
if matches:
    # matches is always None
Of course I know that this is not the most efficient or best way, this is not the point here. The point is, why can't I call read() twice? Do I have to reset the file handle? Or close / reopen the file in order to do that?
Calling read() reads through the entire file and leaves the read cursor at the end of the file (with nothing more to read). If you are looking to read a certain number of lines at a time you could use readline(), readlines() or iterate through lines with for line in handle:.
To answer your question directly: once a file has been read with read(), you can use seek(0) to return the read cursor to the start of the file. If you know the file isn't going to be too large, you can also save the read() output to a variable and use it in your findall expressions.
P.S. Don't forget to close the file after you are done with it.
As other answers suggested, you should use seek().
I'll just write an example:
>>> a = open('file.txt')
>>> a.read()
#output
>>> a.seek(0)
>>> a.read()
#same output
Everyone who has answered this question so far is absolutely right: read() moves through the file, so once it has reached the end, calling it again just returns an empty string.
What I'll add is that in your particular case, you don't need to seek back to the start or reopen the file, you can just store the text that you've read in a local variable, and use it twice, or as many times as you like, in your program:
f = f.open()
text = f.read() # read the file into a local variable
# get the year
match = re.search(r'Popularity in (\d+)', text)
if match:
    print(match.group(1))
# get all the names
matches = re.findall(r'<td>(\d+)</td><td>(\w+)</td><td>(\w+)</td>', text)
if matches:
    # matches will now not always be None
The read pointer moves to after the last read byte/character. Use the seek() method to rewind the read pointer to the beginning.
Every open file has an associated position.
When you read() you read from that position.
For example read(10) reads the first 10 bytes from a newly opened file, then another read(10) reads the next 10 bytes.
read() without arguments reads all of the contents of the file, leaving the file position at the end of the file. Next time you call read() there is nothing to read.
You can use seek to move the file position. Or probably better in your case would be to do one read() and keep the result for both searches.
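A quick way to see the file position at work, assuming an in-memory io.StringIO object as a stand-in for an open text file:

```python
import io

f = io.StringIO("abcdefghijklmnopqrstuvwxyz")  # stands in for an open text file

first = f.read(10)    # the first 10 characters
second = f.read(10)   # the next 10 characters
rest = f.read()       # whatever is left, up to the end
empty = f.read()      # nothing left to read, so this is an empty string

f.seek(0)             # move the position back to the start
again = f.read(10)    # the first 10 characters once more

print(first, second, rest, repr(empty), again)
```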
read() consumes. So, you could reset the file, or seek to the start before re-reading. Or, if it suits your task, you can use read(n) to consume only n bytes.
I always find the read method something of a walk down a dark alley. You go down a bit and stop, but if you are not counting your steps you are not sure how far along you are. seek() solves this by repositioning; the other option is tell(), which returns the current position in the file. Maybe the Python file API could combine read and seek into a read_from(position, bytes) to make it simpler. Until that happens, you should read this page.
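Such a helper is easy to sketch yourself. The read_from function below is hypothetical (it is not part of the Python file API); it assumes a seekable file object and uses tell() to remember where you were so the caller's position is left untouched.

```python
import io

def read_from(fp, position, size):
    """Hypothetical helper: read `size` bytes starting at `position`,
    then put the file position back where it was."""
    saved = fp.tell()          # remember the current position
    try:
        fp.seek(position)      # jump to the requested offset
        return fp.read(size)
    finally:
        fp.seek(saved)         # restore the caller's position

fp = io.BytesIO(b"abcdefghij")  # stands in for open(path, 'rb')
fp.read(3)                      # advance to position 3
chunk = read_from(fp, 5, 2)     # read 2 bytes starting at offset 5
where = fp.tell()               # still 3: the helper restored it

print(chunk, where)
```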

Back to the top of file?

I have a program where I'm reading numbers from a file and adding those numbers to different lists in my program. Now, I need to jump back to the top of the file and read from the top again. Does anyone know if there is a command that does that, or if it's even possible?
You can use seek(0) to start from the beginning all over again.
Actually, as you read from a file, its offset is continuously updated to the current position. seek() lets you set the offset to any position.
At the beginning, the offset is 0. So f.seek(0) sets the offset back to the beginning of the file.
with open('filename','r') as f:
    f.read(100) # read the first 100 bytes
    f.seek(0) # set the offset back to the beginning
    f.read(50) # read the first 50 bytes again
There are a couple of ways to achieve this. The simplest is seek(0), where 0 represents the start of the file.
You can even use tell() to store any position of the file and reuse it:
with open('file.txt', 'r') as f:
    first_position = f.tell()
    f.read() # your read
    f.seek(first_position) # takes you back to the position you marked
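Applied to the original question (reading numbers into different lists in two passes), it could look roughly like this. The one-number-per-line layout and the even/odd split are assumptions made for the example, and io.StringIO stands in for the real file.

```python
import io

f = io.StringIO("1\n2\n3\n4\n")   # stands in for open('numbers.txt')

# First pass: collect the even numbers.
evens = [int(line) for line in f if int(line) % 2 == 0]

f.seek(0)                          # jump back to the top of the file

# Second pass: collect the odd numbers.
odds = [int(line) for line in f if int(line) % 2 == 1]

print(evens, odds)
```

Without the seek(0) the second pass would find nothing, because the first pass leaves the position at the end of the file.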

Python Reading numbers from text file

I know this is a simple question, but I am extremely stuck.
file=open("record.txt","w+")
record = file.read()
print("The record is "+str(record)+"!!")
# main code...
file.write(str(reaction))
file.close()
I have got this code and there is the number 0.433534145355 in the file, but when I run the print with str(record), it only shows The record is !! and the number is not there. What is wrong with this code? Is there something special about decimal places? I do not want to use int().
As it says here:
'w+' Open for reading and writing. The file is created if it does not
exist, otherwise it is truncated. The stream is positioned at
the beginning of the file.
so yes, your file is also opened for reading, but it is truncated (i.e. it is now zero bytes long, it's empty), leaving nothing left to read of what was there already.
Essentially, the w in 'w+' means the mode is orientated to writing, giving you the option to read as well (useful in those cases when you need to seek back and read what you have written. There will be nothing to read unless you write)
Instead you can use:
'r+' Open for reading and writing. The stream is positioned at the
beginning of the file.
In this case, the r in 'r+' signifies the mode is orientated to reading, giving you the option to seek and write where necessary (useful when data is present already, but might need to be changed)
If you want to read from a file, you have to open it for reading too (r).
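The difference between the two modes can be checked with a short experiment. This is only a sketch: the temporary record.txt and the replacement value 0.25 are made up for the demonstration, and float() is used since the record is a decimal number.

```python
import os
import tempfile

# Hypothetical scratch file holding an existing record.
path = os.path.join(tempfile.mkdtemp(), "record.txt")
with open(path, "w") as f:
    f.write("0.433534145355")

# 'r+' keeps the existing content, so it can be read first and then replaced.
with open(path, "r+") as f:
    record = float(f.read())   # the old record is still there
    f.seek(0)                  # go back to the start before writing
    f.write("0.25")            # write the new record over the old one
    f.truncate()               # cut off the leftover digits of the old record

with open(path) as f:
    updated = f.read()

# 'w+' empties the file as soon as it is opened, so there is nothing to read.
with open(path, "w+") as f:
    after_w_plus = f.read()

print(record, updated, repr(after_w_plus))
```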
