I am trying to read the following file line by line and check if a value exists in the file. What I am trying currently is not working. What am I doing wrong?
If the value exists I do nothing. If it does not then I write it to the file.
file.txt:
123
345
234
556
654
654
Code:
file = open("file.txt", "a+")
lines = file.readlines()
value = '345'
if value in lines:
    print('val ready exists in file')
else:
    # write to file
    file.write(value)
There are two problems here:
.readlines() returns lines with \n not trimmed, so your check will not work properly.
a+ mode opens a file with position set to the end of the file. So your readlines() currently returns an empty list!
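Both problems are easy to confirm for yourself. Here is a minimal sketch, assuming a throwaway temporary file (the demo_path name and the file contents are made up for illustration):

```python
import os
import tempfile

# Hypothetical scratch file standing in for file.txt
demo_path = os.path.join(tempfile.mkdtemp(), "file.txt")
with open(demo_path, "w") as f:
    f.write("123\n345\n")

with open(demo_path, "a+") as f:
    at_end = f.readlines()     # a+ starts at the end of the file, so nothing is read
    f.seek(0)                  # rewind to the start
    raw_lines = f.readlines()  # each element still carries its trailing "\n"

print(at_end)
print(raw_lines)
print("345" in raw_lines)      # the bare value is compared against "345\n"
```

The first list is empty because of the a+ position, and the membership test fails because of the untrimmed newlines.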
Here is a direct fixed version of your code, also adding a context manager to auto-close the file:
value = '345'
with open("file.txt", "a+") as file:
    file.seek(0) # set position to start of file
    lines = file.read().splitlines() # now we won't have those newlines
    if value in lines:
        print('val ready exists in file')
    else:
        # write to file
        file.write(value + "\n") # in append mode writes will always go to the end, so no need to seek() here
However, I agree with @RoadRunner that it is better to just use r+ mode; then you don't need the seek(0). The cleanest approach, though, is to split your read and write phases completely, so you don't run into file position problems.
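Splitting the two phases could look roughly like this. It is only a sketch: the append_if_missing helper and the temporary demo file are names invented for the example, and it assumes the file already ends with a newline.

```python
import os
import tempfile

def append_if_missing(path, value):
    """Return True if value was appended, False if it was already a line in the file."""
    # Phase 1: read only
    with open(path, "r") as f:
        lines = f.read().splitlines()
    if value in lines:
        return False
    # Phase 2: append only
    with open(path, "a") as f:
        f.write(value + "\n")
    return True

# Hypothetical scratch file standing in for file.txt
demo_path = os.path.join(tempfile.mkdtemp(), "file.txt")
with open(demo_path, "w") as f:
    f.write("123\n345\n")

first = append_if_missing(demo_path, "345")   # already present
second = append_if_missing(demo_path, "999")  # gets appended
with open(demo_path) as f:
    final_lines = f.read().splitlines()
print(first, second, final_lines)
```

Because each open() does exactly one job, the file position never has to be managed by hand.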
I would consider several changes.
1: Use with to automatically close the file.
2: Use strip() to remove leading or trailing whitespace, such as \n
3: Use a break for the loop.
4: Add \n in the write part.
value = "345"
with open("file.txt", "a+") as file:
    file.seek(0)
    for line in file.readlines():
        if line.strip("\n") == value:
            print('val ready exists in file')
            break
    else:
        # write to file
        file.write(f"\n{value}")
When working with I/O, the recommended approach is to use a context manager. Context managers let you allocate and release resources precisely when you want to; the most widely used example is the with statement. If you have a large file, it is better not to use file.readlines() or read(): readlines() returns a list containing every line in the file, so the whole file ends up in memory. It is better to iterate over the file stream line by line, since the file object yields lines lazily. Also wrap I/O operations in try/except:
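values = ['123', '233']  # ...and any further values to check
try:
    with open("file.txt", "r+") as fp:
        missing = list(values)
        for line in fp:  # stream the file line by line
            if line.strip() in missing:
                print('val ready exists in file')
                missing.remove(line.strip())
        # the loop leaves the position at the end of the file, so writes append
        for val in missing:
            fp.write(val + "\n")
except OSError:  # catch whatever you think the code above can raise
    # do exception handling here, and re-raise if you want
    raise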
Since you want to open the file for reading and writing, I suggest using the r+ mode from open(). This will open the file at the beginning of the file, since we want to first read all lines. Using a+ will open the file for reading and writing at the end of the file, which will cause lines to give you an empty list from readlines().
You also need to strip newlines from lines before checking if the value exists. This is because 345 is not equal to 345\n. We can use a list comprehension to strip the newlines from lines using str.rstrip(), which strips whitespace from the right. Additionally, if you have to do repeated lookups for multiple values, it might be worth converting lines to a set for constant-time lookups, instead of doing a linear search through a list.
It's also worth using with statement context managers when reading files, since closing the file is handled for you.
value = '345'
with open("file.txt", mode="r+") as file:
    lines = [line.rstrip() for line in file.readlines()]
    if value in lines:
        print('value ready exists in file')
    else:
        file.write(f"{value}\n")
The other choice is to use f.seek(0) with a+ to set the position at the beginning of the file, as shown in @Cihan Ceyhan's answer. However, I think this overcomplicates things, and it's just easier to use the r+ mode.
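For the multiple-values case, a set-based variant might look like this. It is a sketch under assumptions: the wanted list and the temporary demo file are invented for illustration, and the file is assumed to end with a newline.

```python
import os
import tempfile

# Hypothetical scratch file standing in for file.txt
demo_path = os.path.join(tempfile.mkdtemp(), "file.txt")
with open(demo_path, "w") as f:
    f.write("123\n345\n")

wanted = ["345", "777", "123"]  # made-up values to look up

with open(demo_path, "r+") as f:
    existing = {line.rstrip() for line in f}            # set: constant-time membership tests
    missing = [v for v in wanted if v not in existing]
    f.writelines(f"{v}\n" for v in missing)             # position is at EOF after reading

with open(demo_path) as f:
    final_lines = f.read().splitlines()
print(missing)
print(final_lines)
```

The set is built once, so each additional lookup costs the same regardless of how many lines the file has.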
This should work :)
filename = 'file.txt'
value = '345'
with open(filename) as f:
    if value in f.read(): # read the file and check if value is in it
        print('Value is in the file')
This will check whether the value is in the file and, if it's not there, add the value to the file.
filename = 'file_1.txt'
value = '999'
with open(filename, 'r+') as f:
    if value in f.read():
        print(f"Value {value} is in the file")
    else:
        print("The value not in the file.\nTherefore, saving the value in the file.")
        f.write(f"{value}\n")
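One caveat with value in f.read(): it is a substring test on the whole file content, so a value can match inside a longer entry. A small sketch with made-up content shows the difference from a whole-line test:

```python
text = "123\n3456\n"   # made-up file content; note there is no line that is exactly "345"
value = "345"

substring_hit = value in text            # matches inside "3456"
line_hit = value in text.splitlines()    # compares against whole lines only

print(substring_hit, line_hit)
```

If you need exact matches, compare against text.splitlines() (or stripped lines) instead of the raw string.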
When you open a file in "a+" mode, the file cursor from which readlines() starts reading is at the end of the file, so readlines() reads nothing. You need to call f.seek(0) to move the cursor to the beginning.
file = open("file.txt", "a+")
file.seek(0)
lines = [line.strip() for line in file.readlines()]
print(lines)
value = '345'
if value in lines:
    print('val ready exists in file')
else:
    print("else")
    # write to file
    file.write(value)
file.close()
Your python script works for me properly. I guess the last line in the else statement should be file.write(value) instead of file.write(val).
The readlines() method returns a list in which each element ends with \n, so comparing against a value without \n never matches and the condition is always false. Here is code that solves your problem:
f_name = "file.txt"
text = "102"
def check_in_file(file_name, value):
    with open(file_name, 'r') as read_obj:
        for line in read_obj:
            if value in line:
                return True
    return False
if check_in_file(f_name, text):
    print('Yes, string found in file')
else:
    print('String not found in file')
    file = open(f_name, 'a')
    file.write("\n"+text)
    file.close()
Related
I am trying to delete all the lines in a text file after a line that contains a specific string. What I am trying to do is find the number of the line in said file and rewrite the whole text up until that line.
The code that I'm trying is the following:
import itertools as it
with open('sampletext.txt', "r") as rf:
    for num, line in enumerate(rf, 1): #Finds the number of the line in which a specific string is contained
        if 'string' in line:
            print(num)
            with open('sampletext_copy.txt', "w") as wf:
                for line in it.islice(rf, 0, num):
                    wf.write(line)
Also would appreciate any tips on how to do this. Thank you!
You could do it like this:
with open('sampletext.txt', "r") as rf, open('sampletext_copy.txt', "w") as wf:
    for line in rf:
        if 'string' in line:
            break
        wf.write(line)
Basically, you open both files at the same time, then read the input file line-by-line. If string is in the line, then you're done - otherwise, write it to the output file.
If you want to apply the changes to the original file, you can do so with the .truncate() method of the file object:
with open(r"sampletext.txt", "r+") as f:
    while line := f.readline():
        if line.rstrip() == "string": # line.startswith("string")
            f.truncate(f.tell()) # removes all content after current position
            break
Here we iterate over the file until we reach the specific line, then resize the stream to the number of bytes we've already read (which we get from .tell()).
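To see the effect without touching a real file, here is a small sketch on a temporary file (the path and contents are invented for the example). It uses iter(f.readline, "") in place of the walrus operator so it also runs on Python versions older than 3.8:

```python
import os
import tempfile

# Hypothetical scratch file standing in for sampletext.txt
demo_path = os.path.join(tempfile.mkdtemp(), "sampletext.txt")
with open(demo_path, "w") as f:
    f.write("keep\nstring\ndrop 1\ndrop 2\n")

with open(demo_path, "r+") as f:
    for line in iter(f.readline, ""):   # readline() keeps tell() usable
        if line.rstrip() == "string":
            f.truncate(f.tell())        # cut everything after the matching line
            break

with open(demo_path) as f:
    remaining = f.read()
print(repr(remaining))
```

Note that the matching line itself is kept; only the lines after it are removed.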
Just to complement Donut's answer, if you want to modify the file in place, there's a much more efficient solution:
with open('sampletext.txt', "r+") as f:
    for line in iter(f.readline, ''):  # Can't use for line in f: because it disables
                                       # tell for txt
    # Or for walrus lovers:
    # while line := f.readline():
        if 'string' in line:
            f.seek(0, 1)  # Needed to ensure underlying handle matches logical read
                          # position; f.seek(f.tell()) is logically equivalent
            f.truncate()
            break
If issue #26158 is ever fixed (so calling truncate on a file actually truncates at the logical position, not the arbitrary position of the underlying raw handle that's likely advanced a great deal due to buffering), this simpler code would work:
with open('sampletext.txt', "r+") as f:
    for line in f:
        if 'string' in line:
            f.truncate()
            break
I'm working on a script to parse text files into a spreadsheet for myself, and in doing so I need to read through them. The issue is finding out when to stop. Java has a method attached when reading called hasNext() or hasNextLine() I was wondering if there was something like that in Python? For some reason I can't find this anywhere.
Ex:
open(f) as file:
    file.readline()
    nextLine = true
    while nextLine:
        file.readline()
        Do stuff
        if not file.hasNextLine():
            nextLine = false
Just use a for loop to iterate over the file object:
for line in file:
    #do stuff..
Note that this includes the newline character (\n) at the end of each line string. This can be removed through either:
for line in file:
    line = line.rstrip('\n')  # unlike line[:-1], safe when the last line has no newline
    #do stuff...
or:
for line in (l.rstrip('\n') for l in file):
    #do stuff...
You can only check if the file has another line by reading it (although you can check if you are at the end of the file with file.tell without any reading).
This can be done through calling file.readline and checking if the string is not empty or timgeb's method of calling next and catching the StopIteration exception.
So to answer your question exactly, you can check whether a file has another line through:
next_line = file.readline()
if next_line:
    #has next line, do whatever...
or, without modifying the current file pointer:
def has_another_line(file):
    cur_pos = file.tell()
    does_it = bool(file.readline())
    file.seek(cur_pos)
    return does_it
This restores the file pointer, putting the file object back in its original state.
e.g.
$ printf "hello\nthere\nwhat\nis\nup\n" > f.txt
$ python -q
>>> f = open('f.txt')
>>> def has_another_line(file):
...     cur_pos = file.tell()
...     does_it = bool(file.readline())
...     file.seek(cur_pos)
...     return does_it
...
>>> has_another_line(f)
True
>>> f.readline()
'hello\n'
The typical cadence that I use for reading text files is this:
with open('myfile.txt', 'r') as myfile:
    lines = myfile.readlines()
    for line in lines:
        if 'this' in line: #Your criteria here to skip lines
            continue
        #Do something here
Using with keeps the file open only until you have executed all of the code within its block; then the file is closed. I also think it's valuable to highlight the readlines() method here, which reads all lines in the file and stores them in a list. In terms of handling newline (\n) characters, I would point you to @Joe Iddon's answer.
Python doesn't have an end-of-file (EOF) indicator, but you could get the same effect this way:
with open(f) as file:
    file.seek(0, 2) # go to end of file
    eof = file.tell() # get end-of-file position
    file.seek(0, 0) # go back to start of file
    file.readline()
    nextLine = True # maybe nextLine = (file.tell() != eof)
    while nextLine:
        file.readline()
        # Do stuff
        if file.tell() == eof:
            nextLine = False
But as others have pointed out, you may do better by treating the file as an iterable, like this:
with open(f) as file:
    next_line = next(file, '')  # the default '' avoids StopIteration at end of file
    # the loop below terminates when next_line is '',
    # i.e., after failing to read another line at end of file
    while next_line:
        # Do stuff
        next_line = next(file, '')
Files are iterators over lines. If all you want to do is check whether a file has a line left, you can issue line = next(file) and catch the StopIteration raised in case there isn't another line. Alternatively, you can use line = next(file, default) with a non-string default value (e.g. None) and then check against that.
Note that in most cases, you know that you are done when the for loop over the file ends, as the other answers have explained. So make sure you actually need that kind of fine grained control with next.
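A minimal sketch of the next(file, default) variant, using io.StringIO as a stand-in for an open text file (the contents are made up):

```python
import io

file = io.StringIO("hello\nthere\n")  # stand-in for an open text file

seen = []
line = next(file, None)        # None signals "no more lines" instead of raising
while line is not None:
    seen.append(line.rstrip("\n"))
    line = next(file, None)

print(seen)
```

None works as a sentinel here because a line read from a text file is always a str, even when it is blank.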
with open(filepath, 'rt+') as f:
    for line in f.readlines():
        #code to process each line
Opening it this way also closes it when it's finished which is much better on the overall memory usage, which might not matter depending on the file size.
The first line is comparable to:
f = open(....)
f.readlines() gives you a list of all lines in the file.
The loop will start at the first line and end at the last line, and shouldn't throw any errors regarding EOF, for example.
[Edit]
Notice the 'rt+' in the open call. It opens the file in text mode for both reading and writing, i.e. no decode required.
I am writing in Python 3.6 and am having trouble making my code match strings in a short text document. This is a simple example of the exact logic that is breaking my bigger program:
PATH = "C:\\Users\\JoshLaptop\\PycharmProjects\\practice\\commented.txt"
file = open(PATH, 'r')
words = ['bah', 'dah', 'gah', "fah", 'mah']
print(file.read().splitlines())
if 'bah' not in file.read().splitlines():
    print("fail")
with the text document formatted like so:
bah
gah
fah
dah
mah
and it is indeed printing out fail each time I run this. Am I using the incorrect method of reading the data from the text document?
The issue is that you're calling print(file.read().splitlines()) first,
so it exhausts the file, and the next call to file.read().splitlines() returns an empty list.
A better way to "grep" your pattern would be to iterate on the file lines instead of reading it fully. So if you find the string early in the file, you save time:
with open(PATH, 'r') as f:
    for line in f:
        if line.rstrip()=="bah":
            break
    else:
        # else is reached when no break is called from the for loop: fail
        print("fail")
The small catch here is not to forget to call line.rstrip() because file generator issues the line with the line terminator. Also, if there's a trailing space in your file, this code will still match the word (make it strip() if you want to match even with leading blanks)
If you want to match a lot of words, consider creating a set of lines:
lines = {line.rstrip() for line in f}
so your in lines call will be a lot faster.
Try it:
PATH = "C:\\Users\\JoshLaptop\\PycharmProjects\\practice\\commented.txt"
file = open(PATH, 'r')
words = file.read().splitlines()
print(words)
if 'bah' not in words:
    print("fail")
You can't read the file two times.
When you do print(file.read().splitlines()), the file is read and the next call to this function will return nothing because you are already at the end of file.
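The effect is easy to reproduce. In this sketch, io.StringIO stands in for the opened file (the contents are made up); a real file object behaves the same way with read() and seek(0):

```python
import io

file = io.StringIO("bah\ngah\n")   # stand-in for open(PATH, 'r')

first = file.read().splitlines()   # consumes everything
second = file.read().splitlines()  # already at end of file, so nothing is left
file.seek(0)                       # rewind if you really need a second pass
third = file.read().splitlines()

print(first, second, third)
```

Storing the result of the first read in a variable, as in the snippet above, avoids the need to rewind at all.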
PATH = "your_file"
file = open(PATH, 'r')
words = ['bah', 'dah', 'gah', "fah", 'mah']
if 'bah' not in (file.read().splitlines()) :
    print("fail")
As you can see, the output is not 'fail'. You must call file.read().splitlines() only once in the code, or save its result in a variable; otherwise you get the 'fail' message.
I am trying to append a string to a file if the string doesn't exist in the file. However, opening a file with the a+ option doesn't allow me to do it in one go, because opening the file with a+ puts the pointer at the end of the file, meaning that my search will always fail. Is there any good way to do this other than opening the file to read first, closing it, and opening it again to append?
In code, apparently, below doesn't work.
file = open("fileName", "a+")
I need to do following to achieve it.
file = open("fileName", "r")
... check if a string exist in the file
file.close()
... if the string doesn't exist in the file
file = open("fileName", "a")
file.write("a string")
file.close()
To leave the input file unchanged if needle is on any line or to append the needle at the end of the file if it is missing:
with open("filename", "r+") as file:
    for line in file:
        if needle in line:
            break
    else: # not found, we are at the eof
        file.write(needle) # append missing data
I've tested it and it works on both Python 2 (stdio-based I/O) and Python 3 (POSIX read/write-based I/O).
The code uses obscure else after a loop Python syntax. See Why does python use 'else' after for and while loops?
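In case the for/else construct is unfamiliar: the else block runs only when the loop finishes without hitting break. A tiny sketch (the find function and its inputs are made up for illustration):

```python
def find(needle, lines):
    for line in lines:
        if needle in line:
            result = "found"
            break                  # skips the else block
    else:
        result = "not found"       # runs only if the loop was never broken out of
    return result

print(find("b", ["a", "b"]))
print(find("z", ["a", "b"]))
print(find("z", []))               # an empty loop also falls through to else
```

That is why the write in the answer above happens only when no line contained the needle.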
You can set the current position of the file object using file.seek(). To jump to the beginning of a file, use
f.seek(0, os.SEEK_SET)
To jump to a file's end, use
f.seek(0, os.SEEK_END)
In your case, to check if a file contains something, and then maybe append to the file, I'd do something like this:
import os
with open("file.txt", "r+") as f:
    line_found = any("foo" in line for line in f)
    if not line_found:
        f.seek(0, os.SEEK_END)
        f.write("yay, a new line!\n")
There is a minor bug in the previous answers: often, the last line in a text file is missing an ending newline. If you do not take that into account and blindly append some text, your text will be appended to the last line.
For safety:
needle = "Add this line if missing"
with open("filename", "r+") as file:
    ends_with_newline = True
    for line in file:
        ends_with_newline = line.endswith("\n")
        if line.rstrip("\n\r") == needle:
            break
    else: # not found, we are at the eof
        if not ends_with_newline:
            file.write("\n")
        file.write(needle + "\n") # append missing data
As practice, I am learning to read a file.
As is obvious from code, hopefully, I have a file in working/root whatever directory. I need to read it and print it.
my_file=open("new.txt","r")
lengt=sum(1 for line in my_file)
for i in range(0,lengt-1):
    myline=my_file.readlines(1)[0]
    print(myline)
my_file.close()
This returns error and says out of range.
The text file simply contains statements like
line one
line two
line three
.
.
.
With everything else the same, I tried myline=my_file.readline(). I get 7 empty lines.
My guess is that while using for line in my_file, I read up the lines, so I reached the end of the document. To get the result I desire, how do I overcome this?
P.S. If it matters, it's Python 3.3.
No need to count along. Python does it for you:
my_file = open("new.txt","r")
for myline in my_file:
    print(myline)
Details:
my_file is an iterator. This is a special object that you can iterate over.
You can also access a single line:
line1 = next(my_file)
gives you the first line, assuming you just opened the file. Doing it again:
line2 = next(my_file)
you get the second line. If you now iterate over it:
for myline in my_file:
    # do something
it will start at line 3.
Strange extra lines?
print(myline)
will likely print an extra empty line. This is due to a newline read from the file and a newline added by print(). Solution:
Python 3:
print(myline, end='')
Python 2:
print myline, # note the trailing comma.
Playing it safe
Using the with statement like this:
with open("new.txt", "r") as my_file:
    for myline in my_file:
        print(myline)
    # my_file is open here
# my_file is closed here
you don't need to close the file, as that is done as soon as you leave the context, i.e. as soon as you continue with your code at the same level as the with statement.
You can actually take care of all of this at once by iterating over the file contents:
my_file = open("new.txt", "r")
length = 0
for line in my_file:
    length += 1
    print(line)
my_file.close()
At the end, you will have printed all of the lines, and length will contain the number of lines in the file. (If you don't specifically need to know length, there's really no need for it!)
Another way to do it, which will close the file for you (and, in fact, will even close the file if an exception is raised):
length = 0
with open("new.txt", "r") as my_file:
    for line in my_file:
        length += 1
        print(line)
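If you do want the count, enumerate() can keep it for you instead of a manual counter. A small sketch, with io.StringIO standing in for the opened new.txt (the contents are made up):

```python
import io

my_file = io.StringIO("line one\nline two\nline three\n")  # stand-in for open("new.txt", "r")

length = 0                                   # stays 0 if the file is empty
for length, line in enumerate(my_file, 1):   # count from 1 so length ends as the line count
    print(line, end="")                      # end="" avoids doubling the newline

print(length)
```

Initialising length before the loop matters: with an empty file the loop body never runs and the name would otherwise be undefined.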