I am trying out regex operations in Python. However, once I have read the file for the regex search, I am not able to read it again.
f = codecs.open(filename, 'rU', 'utf-8')
#print f.read() works here
#printing the year
year = re.search(r'Popularity in (\w+)',f.read())
print year.group(1)
#now, this returns nothing !
print f.read()
I am not able to understand what I am doing wrong here.
When you call f.read(), the file object reads through to the end of the file and remembers where it stopped. File objects keep track of their current position, so calling f.read() again continues from where the previous read left off, i.e. at the end of the file, and returns an empty string. Calling f.seek(0) resets the position to the start of the file so that you can read it again. In your case it may make more sense to save the content of the file in a variable, which can then be accessed multiple times.
file_content = f.read()
year = re.search(r'Popularity in (\w+)', file_content)
print year.group(1)
print file_content
or
year = re.search(r'Popularity in (\w+)', f.read())
print year.group(1)
f.seek(0) # reset the file read position
print f.read()
I would choose the first option.
Add f.seek(0) before the second read. Once the file has been read completely, the pointer is at the end of the file. You then have to move the pointer back to the start of the file, and to do that you call fileobject.seek(0).
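A minimal sketch of the behaviour described in the answers above, written for Python 3 and run against a throwaway temporary file. The file name and the sample text are assumptions made up for the demonstration, not taken from the question:

```python
import os
import re
import tempfile

# Create a small sample file; its name and contents are invented for this demo.
path = os.path.join(tempfile.mkdtemp(), "sample.html")
with open(path, "w", encoding="utf-8") as out:
    out.write("Popularity in 1990\nMichael Jessica\n")

with open(path, encoding="utf-8") as f:
    first = f.read()    # reads everything; the position is now at the end
    second = f.read()   # nothing is left to read, so this is an empty string
    f.seek(0)           # move the position back to the start of the file
    third = f.read()    # the full contents again

year = re.search(r"Popularity in (\w+)", first).group(1)
print(year)
print(repr(second))
print(third == first)
```

Reading once into a variable, as in the first option above, avoids the need for seek(0) altogether.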
Related
I am trying to read the following file line by line and check if a value exists in the file. What I am trying currently is not working. What am I doing wrong?
If the value exists I do nothing. If it does not then I write it to the file.
file.txt:
123
345
234
556
654
654
Code:
file = open("file.txt", "a+")
lines = file.readlines()
value = '345'
if value in lines:
    print('val ready exists in file')
else:
    # write to file
    file.write(value)
There are two problems here:
.readlines() returns lines with \n not trimmed, so your check will not work properly.
a+ mode opens a file with position set to the end of the file. So your readlines() currently returns an empty list!
Here is a directly fixed version of your code, which also adds a context manager to close the file automatically:
value = '345'
with open("file.txt", "a+") as file:
    file.seek(0)  # set position to start of file
    lines = file.read().splitlines()  # now we won't have those newlines
    if value in lines:
        print('val ready exists in file')
    else:
        # write to file
        file.write(value + "\n")  # in append mode writes will always go to the end, so no need to seek() here
However, I agree with @RoadRunner that it is better to just use r+ mode; then you don't need the seek(0). But the cleanest approach is to split your read and write phases completely, so you don't run into file position problems.
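One way the fully separated read and write phases could look is sketched below. The helper name append_if_missing, the treatment of a missing file as empty, and the temporary demo file are all assumptions for illustration, not part of the answer above:

```python
import os
import tempfile


def append_if_missing(path, value):
    """Append value as a new line unless some line already equals it.

    Returns True if the value was written, False if it was already present.
    """
    # Phase 1: read only. A missing file is treated as empty (an assumption).
    try:
        with open(path, "r") as f:
            content = f.read()
    except FileNotFoundError:
        content = ""

    if value in content.splitlines():
        return False

    # Phase 2: append only. Add a newline first if the last line lacks one.
    with open(path, "a") as f:
        if content and not content.endswith("\n"):
            f.write("\n")
        f.write(value + "\n")
    return True


# Demo on a temporary file whose last line has no trailing newline.
demo_path = os.path.join(tempfile.mkdtemp(), "file.txt")
with open(demo_path, "w") as f:
    f.write("123\n345\n234")

print(append_if_missing(demo_path, "345"))
print(append_if_missing(demo_path, "999"))
```

Because each phase opens the file in a single-purpose mode, there is no shared file position to reason about.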
I would consider several changes.
1: Use with to automatically close the file.
2: Use strip() to remove leading or trailing stuff, like \n
3: Use a break for the loop.
4: Add \n in the write part.
value = "345"
with open("file.txt", "a+") as file:
    file.seek(0)
    for line in file.readlines():
        if line.strip("\n") == value:
            print('val ready exists in file')
            break
    else:
        # write to file
        file.write(f"\n{value}")
When working with I/O, the recommended approach is to use a context manager. Context managers allow you to allocate and release resources precisely when you want to; the most widely used example is the with statement. If you have a large file, it is better not to use file.readlines() or read(): readlines() returns a list containing every line in the file, so the whole file ends up in memory. It is better to iterate over the file object line by line. Always use try/except with I/O operations:
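values = ['123', '233', ...]  # remaining values elided in the original
try:
    with open("file.txt", "r+") as fp:
        # iterate over the file line by line and remember which lines exist
        existing = set()
        for line in fp:
            existing.add(line.strip())
        # the loop consumed the file, so writes now land at the end
        for val in values:
            if val not in existing:
                fp.write(val + "\n")
            else:
                print('val ready exists in file')
except (OSError, ...):  # catch whatever you think the code above can raise
    # do exception handling here, and re-raise if you want
    raise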
Since you want to open the file for reading and writing, I suggest using the r+ mode of open(). This opens the file with the position at the beginning, which is what we want because we first read all the lines. Using a+ opens the file for reading and writing with the position at the end of the file, so readlines() returns an empty list.
You also need to strip the newlines from lines before checking whether the value exists, because 345 is not equal to 345\n. We can use a list comprehension to strip the newlines with str.rstrip(), which strips whitespace from the right. Additionally, if you have to do repeated lookups for multiple values, it might be worth converting lines to a set for constant-time lookups instead of doing a linear search through a list.
It's also worth using a with statement (context manager) when reading files, since closing the file is then handled for you.
value = '345'
with open("file.txt", mode="r+") as file:
    lines = [line.rstrip() for line in file.readlines()]
    if value in lines:
        print('value ready exists in file')
    else:
        file.write(f"{value}\n")
The other choice is to use f.seek(0) with a+ to set the position to the beginning of the file, as shown in @Cihan Ceyhan's answer. However, I think this overcomplicates things, and it's just easier to use the r+ mode.
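The set-based lookup mentioned in this answer for checking several values at once could be sketched as follows. The list of values and the temporary file are hypothetical, chosen only to make the example self-contained:

```python
import os
import tempfile

# Throwaway file with known contents (an assumption for the demo).
path = os.path.join(tempfile.mkdtemp(), "file.txt")
with open(path, "w") as f:
    f.write("123\n345\n234\n")

values = ["345", "999", "123", "777"]  # hypothetical values to check

with open(path, "r+") as f:
    # A set gives average constant-time membership tests.
    existing = {line.rstrip("\n") for line in f}
    missing = [v for v in values if v not in existing]
    # Building the set consumed the file, so the position is at the end
    # and the writes below are appended after the existing lines.
    f.writelines(v + "\n" for v in missing)

print(missing)
```

With a list, every `in` test scans the lines one by one; with a set, the cost of each test does not grow with the number of lines on average.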
This should work :)
filename = 'file.txt'
value = '345'
with open(filename) as f:
    if value in f.read():  # read if value is in file
        print('Value is in the file')
This will check whether the value is in the file and, if it is not there, add the value to the file.
filename = 'file_1.txt'
value = '999'
with open(filename, 'r+') as f:
    if value in f.read():
        print(f"Value {value} is in the file")
    else:
        print("The value not in the file.\nTherefore, saving the value in the file.")
        f.write(f"{value}\n")
When you open a file in "a+" mode, the file cursor from which readlines() starts reading is at the end of the file, so readlines() reads nothing. You need to call f.seek(0) to move the cursor to the beginning.
file = open("file.txt", "a+")
file.seek(0)
lines = [line.strip() for line in file.readlines()]
print(lines)
value = '345'
if value in lines:
    print('val ready exists in file')
else:
    print("else")
    # write to file
    file.write(value)
file.close()
Your python script works for me properly. I guess the last line in the else statement should be file.write(value) instead of file.write(val).
The readlines() method returns a list in which each element still ends with \n, so when the condition is checked, value is compared against strings that include the \n and the test is always false. Here is code that solves your problem:
f_name = "file.txt"
text = "102"
def check_in_file(file_name, value):
    with open(file_name, 'r') as read_obj:
        for line in read_obj:
            if value in line:
                return True
    return False
if check_in_file(f_name, text):
    print('Yes, string found in file')
else:
    print('String not found in file')
    file = open(f_name, 'a')
    file.write("\n"+text)
    file.close()
I'm trying to read from an originally empty file, after a write, before closing it. Is this possible in Python?
with open("outfile1.txt", 'r+') as f:
    f.write("foobar")
    f.flush()
    print("File contents:", f.read())
Flushing with f.flush() doesn't seem to work, as the final f.read() still returns nothing.
Is there any way to read the "foobar" from the file besides re-opening it?
You need to reset the file object's index to the first position, using seek():
with open("outfile1.txt", 'r+') as f:
    f.write("foobar")
    f.flush()
    # "reset" fd to the beginning of the file
    f.seek(0)
    print("File contents:", f.read())
which will make the file available for reading from it.
File objects keep track of current position in the file. You can get it with f.tell() and set it with f.seek(position).
To start reading from the beginning again, you have to set the position to the beginning with f.seek(0).
http://docs.python.org/2/library/stdtypes.html#file.seek
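A small sketch of f.tell() and f.seek(position) in action. It deliberately opens the file in binary mode ("w+b") so that tell() is a plain byte offset, and it creates the file in a temporary directory instead of opening an existing one with 'r+'; both choices are assumptions made for the demo:

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "outfile1.txt")

with open(path, "w+b") as f:      # binary mode: tell() is a plain byte offset
    f.write(b"foobar")
    after_write = f.tell()        # just past the six bytes written
    leftover = f.read()           # nothing follows the current position
    f.seek(0)
    at_start = f.tell()           # back at the beginning
    contents = f.read()           # everything that was written
    f.seek(3)
    tail = f.read()               # from offset 3 to the end

print(after_write, leftover, at_start, contents, tail)
```

In text mode, tell() returns an opaque number that should only be passed back to seek(), which is why the sketch uses bytes.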
Seek back to the start of the file before reading:
f.seek(0)
print f.read()
I am learning python file operations and was experimenting with different options to read and write.
As far as I know this code should be able to both append and read from test.txt file as I have opened it with "a+". But though the append operation is working as expected, I am not getting any output from the print function.
my_file = open('test.txt', 'a+')
my_file.write("You know nothin' Jon Snow.")
content = my_file.read()
print(content)
my_file.close()
What am I doing wrong here?
When you first open the file, the file pointer is at the end of the file. The write leaves the file pointer following the new text. When you try to read, there's nothing left to read; you are already at the end of the file. If you wanted to read the entire contents of the file, you would need to seek to the beginning before reading.
with open('test.txt', 'a+') as my_file:
    my_file.write("You know nothin' Jon Snow.")
    my_file.seek(0)
    content = my_file.read()
    print(content)
After the write you are positioned at the end of the file, so when you do a read operation there is nothing to read. You first need to seek to a position somewhere before the end of the file:
my_file = open('test.txt', 'a+')
my_file.write("You know nothin' Jon Snow.")
my_file.seek(0)
content = my_file.read()
print(content)
my_file.close()
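The following sketch, run on a temporary file with made-up contents, illustrates both points from the answers above: in "a+" mode the initial position is at the end of the file, and writes are still appended after a seek(0). The append-regardless-of-position behaviour is what Python's documentation describes for append mode on most platforms, so treat it as an assumption about your system:

```python
import os
import tempfile

# Throwaway file with one existing line (invented for the demo).
path = os.path.join(tempfile.mkdtemp(), "test.txt")
with open(path, "w") as f:
    f.write("Winter is coming.\n")

with open(path, "a+") as my_file:
    before_seek = my_file.read()   # position starts at the end: empty string
    my_file.seek(0)
    after_seek = my_file.read()    # now the existing line is returned
    my_file.seek(0)
    my_file.write("You know nothin' Jon Snow.\n")  # still lands at the end

with open(path) as f:
    final = f.read()

print(repr(before_seek))
print(final)
```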
I am trying to append a string to a file if the string doesn't exist in the file. However, opening the file with the a+ option doesn't let me do this in one step, because a+ puts the pointer at the end of the file, meaning that my search will always fail. Is there any good way to do this other than opening the file to read first, closing it, and opening it again to append?
In code, the line below apparently doesn't work:
file = open("fileName", "a+")
I need to do the following to achieve it:
file = open("fileName", "r")
... check if a string exist in the file
file.close()
... if the string doesn't exist in the file
file = open("fileName", "a")
file.write("a string")
file.close()
To leave the input file unchanged if needle is on any line or to append the needle at the end of the file if it is missing:
with open("filename", "r+") as file:
    for line in file:
        if needle in line:
            break
    else:  # not found, we are at the eof
        file.write(needle)  # append missing data
I've tested it and it works on both Python 2 (stdio-based I/O) and Python 3 (POSIX read/write-based I/O).
The code uses Python's obscure else-after-a-loop syntax. See "Why does python use 'else' after for and while loops?"
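For readers unfamiliar with the loop else clause, here is a minimal illustration. The function name find_line and the sample lists are hypothetical; the point is only that the else block runs when the loop finishes without hitting break:

```python
def find_line(lines, needle):
    """Return the index of the first line containing needle, or -1."""
    for index, line in enumerate(lines):
        if needle in line:
            found = index
            break            # leaving via break skips the else block
    else:
        found = -1           # runs only when the loop finished without break
    return found


print(find_line(["123", "345", "234"], "345"))
print(find_line(["123", "345", "234"], "999"))
print(find_line([], "999"))
```

An empty iterable also reaches the else block, which is why the approach above still appends the needle to an empty file.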
You can set the current position of the file object using file.seek(). To jump to the beginning of a file, use
f.seek(0, os.SEEK_SET)
To jump to a file's end, use
f.seek(0, os.SEEK_END)
In your case, to check whether a file contains something and then maybe append to the file, I'd do something like this:
import os
with open("file.txt", "r+") as f:
    line_found = any("foo" in line for line in f)
    if not line_found:
        f.seek(0, os.SEEK_END)
        f.write("yay, a new line!\n")
There is a minor bug in the previous answers: often, the last line in a text file is missing an ending newline. If you do not take that into account and blindly append some text, your text will be appended to the last line.
For safety:
needle = "Add this line if missing"
with open("filename", "r+") as file:
    ends_with_newline = True
    for line in file:
        ends_with_newline = line.endswith("\n")
        if line.rstrip("\n\r") == needle:
            break
    else:  # not found, we are at the eof
        if not ends_with_newline:
            file.write("\n")
        file.write(needle + "\n")  # append missing data
Edited my program - still having same issue
Also, the linked answer that was recommended is useless, as it only tells you that you cannot modify a file in place and does not offer any good solution.
I have a file that has line numbers at the start of it. I wrote a python script to eliminate these line numbers. This is my second attempt at it and I am still having the same issues
First I open the file and save it to a variable to reuse later:
#Open for reading and save the file information to text
fin = open('test.txt','r')
text = fin.read()
fin.close
#Make modifications and write to new file
fout = open('test_new.txt','w')
for line in text:
    whitespaceloc = line.find(' ')
    newline = line[whitespaceloc:]
    fout.write(newline)
fout.close()
I have also tried using the 'with' keyword, with no luck.
When I open test_new.txt it is empty
What is going on here?
My advice on how to do this would be:
1) Read the file to a buffer:
with open('file.txt','r') as myfile:
    lines = myfile.readlines()
2) Now close the file and overwrite it with whatever changes you want to make, just as you did before:
with open('file.txt','w') as myfile:
    for line in lines:
        whitespaceloc = line.find(' ')
        newline = line[whitespaceloc:]
        myfile.write("%s" % newline)
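A point worth making explicit: iterating over a string, as in `for line in text` after text = fin.read(), yields single characters rather than lines, whereas readlines() yields whole lines. The sketch below follows the same two-step approach as the answer above but uses str.partition, so the separating space is dropped and lines without a space are left unchanged. The helper name strip_line_number and the sample file contents are assumptions for illustration:

```python
import os
import tempfile


def strip_line_number(line):
    """Drop everything up to and including the first space.

    Lines without a space are returned unchanged.
    """
    head, sep, rest = line.partition(" ")
    return rest if sep else line


# Throwaway input file with leading line numbers (invented for the demo).
path = os.path.join(tempfile.mkdtemp(), "test.txt")
with open(path, "w") as f:
    f.write("1 first line\n2 second line\n3\n")

# Step 1: read all lines into memory.
with open(path, "r") as f:
    lines = f.readlines()

# Step 2: overwrite the same file with the modified lines.
with open(path, "w") as f:
    f.writelines(strip_line_number(line) for line in lines)

with open(path) as f:
    result = f.read()
print(result)
```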