How to print lines with a certain length in a file (Python)

I am new to python. I have a document that has one random word per line. There are thousands of words in this file. I am trying to print only the words that are four letters long. I tried this:
f=open("filename.txt")
Words=f.readlines()
for line in f:
    if len(line)==4:
        print(line)
f.close()
But python is blank when I do this. I am assuming I need to strip the blank spaces as well, but when I did
f.strip()
I received an error stating that .strip() doesn't apply to list items. Any help is appreciated. Thanks!

'Python is blank' because you attempt to iterate over the file for a second time.
The first time is with readlines(), so when that iteration is finished you are at the end of the file. Then when you do for line in f you are already at the end of the file so there is nothing left over which to iterate. To fix this, drop the call to readlines().
To do what you want, I would just do this:
with open('filename.txt') as f:
    for line in f: # No need for `readlines()`
        word = line.strip() # Strip the line, not the file object.
        if len(word) == 4:
            print(word)
Your other error occurs with f.strip() because f is a file object, but you can only strip a string. So just strip the line on each iteration, as shown in the example above.
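To see the exhaustion for yourself, here is a minimal sketch. It uses io.StringIO as a stand-in for the opened file (an assumption, so the demo runs without filename.txt), and the sample words are made up. A real file object behaves the same way for this purpose.

```python
import io

# Stand-in for open("filename.txt"); the sample words are invented for the demo.
f = io.StringIO("tree\nhouse\nbird\nsky\n")

words = f.readlines()   # consumes the whole file
leftover = list(f)      # nothing left: we are already at the end

f.seek(0)               # rewind to the start if you need to read again
four_letter = [line.strip() for line in f if len(line.strip()) == 4]

print(len(words))       # 4
print(leftover)         # []
print(four_letter)      # ['tree', 'bird']
```

Rewinding with seek(0) works, but dropping the extra readlines() call is usually simpler.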

You should do:
for line in Words:
instead of
for line in f:

You want line.strip() because f is a file object, not a string.

Related

Checking if string is in text file is not working

I am writing in Python 3.6 and am having trouble making my code match strings in a short text document. This is a simple example of the exact logic that is breaking my bigger program:
PATH = "C:\\Users\\JoshLaptop\\PycharmProjects\\practice\\commented.txt"
file = open(PATH, 'r')
words = ['bah', 'dah', 'gah', "fah", 'mah']
print(file.read().splitlines())
if 'bah' not in file.read().splitlines():
    print("fail")
with the text document formatted like so:
bah
gah
fah
dah
mah
and it is indeed printing out fail each time I run this. Am I using the incorrect method of reading the data from the text document?
The issue is that print(file.read().splitlines()) exhausts the file, so the next call to file.read().splitlines() returns an empty list.
A better way to "grep" your pattern would be to iterate on the file lines instead of reading it fully. So if you find the string early in the file, you save time:
with open(PATH, 'r') as f:
    for line in f:
        if line.rstrip()=="bah":
            break
    else:
        # else is reached when no break is called from the for loop: fail
        print("fail")
The small catch here is not to forget to call line.rstrip(), because iterating over the file yields each line with its line terminator. Also, if there's trailing whitespace in your file, this code will still match the word (make it strip() if you want to match even with leading blanks).
If you want to match a lot of words, consider creating a set of lines:
lines = {line.rstrip() for line in f}
so your in lines call will be a lot faster.
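As a rough sketch of the set approach, assuming the same five-line file content as in the question: io.StringIO stands in for the real file here, and the extra word 'zah' is made up to show a miss.

```python
import io

# Stand-in for open(PATH, 'r'); the content is copied from the question.
f = io.StringIO("bah\ngah\nfah\ndah\nmah\n")

# Read the file once and keep the stripped lines in a set.
lines = {line.rstrip() for line in f}

words = ['bah', 'dah', 'gah', 'fah', 'mah', 'zah']  # 'zah' is a made-up miss
missing = [w for w in words if w not in lines]
print(missing)  # ['zah']
```

The file is read only once, and each membership test against the set does not depend on how many lines the file has.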
Try it:
PATH = "C:\\Users\\JoshLaptop\\PycharmProjects\\practice\\commented.txt"
file = open(PATH, 'r')
words = file.read().splitlines()
print(words)
if 'bah' not in words:
    print("fail")
You can't read the file two times.
When you do print(file.read().splitlines()), the file is read and the next call to this function will return nothing because you are already at the end of file.
PATH = "your_file"
file = open(PATH, 'r')
words = ['bah', 'dah', 'gah', "fah", 'mah']
if 'bah' not in (file.read().splitlines()) :
    print("fail")
As you can see, the output is no longer 'fail'. Call file.read().splitlines() only once in your code, or save its result in a variable; otherwise you will get the 'fail' message.

skip a line when it has a # character in python?

I would like some help with a problem that I'm facing as a new Python programmer. I created a .txt file in C++ in which some lines start with a # character, meaning they are comments, and I want to skip those lines when I read the file in my Python script. How can I do that?
I think this should help you.
I'll read the whole file and save all lines into a list.
Then I'll iterate over this list looking for the first character in every line.
If the first char is equal to "#", go to the next line.
Otherwise, append this line to a new list called selected_lines.
My code isn't especially efficient or compact, but I think it may help you.
selected_lines = []
filepath = "/usr//home/Desktop/myfile.txt"
with open(filepath, "r") as f:
    # readlines() already returns a list of lines, so assign it directly
    lines = f.readlines()
for line in lines:
    if line[0:1] == "#":
        continue
    else:
        selected_lines.append(line)
Something like this would work if it's just the beginning character. If you need it to ignore comments after code, you would need to modify it to if '#' in line: and handle it accordingly.
with open('somefile.txt', 'r') as f:
    for line in f:
        # Use continue so your code doesn't become a nested mess.
        # if this check passes, we can assume line is not a comment.
        if line[0] == '#':
            continue
        # Do stuff with line after checking for the comment.
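If comments can also follow data on the same line, one possible way to handle that is to cut each line at the first '#'. This is only a sketch: the helper name strip_comment and the sample text are made up, io.StringIO stands in for the opened file, and it assumes '#' never appears inside the data itself.

```python
import io

def strip_comment(line):
    """Return the part of the line before the first '#', without surrounding whitespace."""
    return line.split('#', 1)[0].strip()

# Stand-in for an opened file; the content is invented for the demo.
sample = io.StringIO(
    "# header comment\n"
    "1 2 3\n"
    "4 5 6  # trailing comment\n"
    "\n"
    "#another\n"
    "7 8 9\n"
)

data_lines = []
for line in sample:
    content = strip_comment(line)
    if not content:     # whole-line comment or blank line
        continue
    data_lines.append(content)

print(data_lines)  # ['1 2 3', '4 5 6', '7 8 9']
```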

Reading a file in Python won't read the first line

I am reading a text file, separating the word and the number with the comma then adding them into separate lists however, the first name gets omitted. Here is my code.
for line in keywordFile:
    line = keywordFile.readline()
    keyword.append(line[0])
    keywordValue.append(line[1])
You're jumping ahead with the extra readline() call; just use the line defined in the for statement.
It seems that keywordFile is a file object. Since file objects are iterators (one-shot iterables), the for loop already consumes one line on each pass:
for line in keywordFile:
^
You then use readline() to read the next line, which is redundant here. To get rid of this problem, remove that call.
A more Pythonic way is to use a list comprehension that splits each line on the comma. If you want one list of all words, use a nested loop:
with open('filename') as keywordFile:
    words = [w for line in keywordFile for w in line.split(',')]
But if you want the separated words of each line in a separate list, a single loop is enough:
with open('filename') as keywordFile:
    words = [line.split(',') for line in keywordFile]
Or, as a better choice, use the csv module to read the file as separated fields. You can pass a delimiter argument to the csv.reader function:
import csv
with open('file_name') as f:
    words = csv.reader(f, delimiter=',')
Here words is an iterator that yields a list of the separated fields for each line; consume it while the file is still open. If you want to concatenate them, you can use the itertools.chain.from_iterable() function.
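A small sketch of how csv.reader and itertools.chain.from_iterable() fit together. The rows are invented for the demo, io.StringIO stands in for the opened file, and treating the second column as an integer is an assumption about the data.

```python
import csv
import io
from itertools import chain

# Stand-in for open('file_name', newline=''); the rows are made up for the demo.
f = io.StringIO("apple,3\nbanana,5\ncherry,7\n")

rows = list(csv.reader(f, delimiter=','))   # consume the reader while the file is open
flat = list(chain.from_iterable(rows))      # one flat list of all fields

keyword = [row[0] for row in rows]
keywordValue = [int(row[1]) for row in rows]  # assumes the second field is a number

print(rows)   # [['apple', '3'], ['banana', '5'], ['cherry', '7']]
print(flat)   # ['apple', '3', 'banana', '5', 'cherry', '7']
```

Unlike line.split(','), csv.reader also removes the line terminator from the last field.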
Try something like:
for line in keywordFile:
    tokens = line.split(',')
    keyword.append(tokens[0])
    keywordValue.append(tokens[1])

How to find a word in a string in a list? (Python)

I'm trying to find a way to read a txt file and find a specific word. I have been opening the file with
myfile=open('daily.txt','r')
r=myfile.readlines()
That returns a list with a string for each line in the file; I want to find a word in one of the strings inside the list.
Edit:
I'm sorry, I meant whether there is a way to find where the word is in the txt file, like x=myfile[12] x=x[2:6]
def findLines():
    myWord = 'someWordIWantToSearchFor'
    answer = []
    with open('daily.txt') as myfile:
        lines = myfile.readlines()
    for line in lines:
        if myWord in line:
            answer.append(line)
    return answer
with open('daily.txt') as myfile:
    for line in myfile:
        if "needle" in line:
            print("found it:", line)
With the above, you don't need to allocate memory for the entire file at once, only one line at a time. This will be much more efficient if your file is large. It also closes the file automatically at the end of the with.
I'm not sure whether the suggested answers solve the problem, because I'm not sure what the original poster means. If they really mean "words", not "substrings", then the solutions don't work, because, for example,
'cat' in line
evaluates to True if line contains the word 'catastrophe.' I think you may want to amend these answers along the lines of
if word in line.split(): ...
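To report where a whole word occurs, which is what the edit to the question asks about, something like the following could work. It is a sketch only: find_word is a hypothetical helper, the sample text is invented, io.StringIO stands in for the opened file, and it uses re word boundaries rather than split() so that punctuation next to the word does not hide a match.

```python
import io
import re

def find_word(lines, word):
    """Yield (line_number, column) for each whole-word match; both are 0-based."""
    pattern = re.compile(r'\b' + re.escape(word) + r'\b')
    for line_number, line in enumerate(lines):
        for match in pattern.finditer(line):
            yield line_number, match.start()

# Stand-in for open('daily.txt'); the content is made up for the demo.
sample = io.StringIO("a catastrophe\nthe cat sat\nno match here\ncat, again a cat\n")

hits = list(find_word(sample, 'cat'))
print(hits)  # [(1, 4), (3, 0), (3, 13)]
```

'catastrophe' on the first line is skipped because 'cat' there is not followed by a word boundary.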

Remove whitespaces in the beginning of every string in a file in python?

How to remove whitespaces in the beginning of every string in a file with python?
I have a file myfile.txt with the strings as shown below in it:
_ _ Amazon.inc
Arab emirates
_ Zynga
Anglo-Indian
Those underscores are spaces.
The code must go through every line of the file and remove all the whitespace at the beginning of each line.
I've tried using lstrip, but that's not working for multiple lines, and neither is readlines().
Would using a for loop help?
All you need to do is read the lines of the file one by one and remove the leading whitespace from each line. After that, you can join the lines again and you'll get back the original text without the leading whitespace:
with open('myfile.txt') as f:
    line_lst = [line.lstrip() for line in f.readlines()]
    lines = ''.join(line_lst)
print(lines)
Assuming that your input data is in infile.txt, and you want to write this file to output.txt, it is easiest to use a list comprehension:
inf = open("infile.txt")
stripped_lines = [l.lstrip() for l in inf.readlines()]
inf.close()
# write the new, stripped lines to a file
outf = open("output.txt", "w")
outf.write("".join(stripped_lines))
outf.close()
To read the lines from myfile.txt and write them to output.txt, use
with open("myfile.txt") as input:
    with open("output.txt", "w") as output:
        for line in input:
            output.write(line.lstrip())
That will make sure that you close the files after you're done with them, and it'll make sure that you only keep a single line in memory at a time.
The above code works in Python 2.5 and later because of the with keyword. For Python 2.4 you can use
input = open("myfile.txt")
output = open("output.txt", "w")
for line in input:
    output.write(line.lstrip())
if this is just a small script where the files will be closed automatically at the end. If this is part of a larger program, then you'll want to explicitly close the files like this:
input = open("myfile.txt")
try:
    output = open("output.txt", "w")
    try:
        for line in input:
            output.write(line.lstrip())
    finally:
        output.close()
finally:
    input.close()
You say you already tried lstrip and that it didn't work for multiple lines. The "trick" is to run lstrip on each individual line, like I do above.
