Python - file lines - palindrome

I have been working through Python tasks to learn the language, and I came across one where I have to read a file that contains a few words and print each line that is a palindrome (the same when written backwards: lol > lol).
I tried this code, but it doesn't print anything to the terminal:
with open("words.txt") as f:
    for line in f:
        if line == line[::-1]:
            print line
But if I print like this, without an if condition, it prints the words:
with open("words.txt") as f:
    for line in f:
        print line
I wonder why it won't print the words that I've written in the file:
sefes
kurwa
rawuk
lol
bollob

This is because each of those lines has "\n" at the end. "\n" means new line, so the reversed line starts with the newline instead of ending with it, and none of the lines are palindromes as far as Python is concerned.
You can strip off the "\n" first by doing:
with open("words.txt") as f:
    for line in f:
        if line.strip() == line.strip()[::-1]:
            print line

The last character of each line is a newline character ("\n"). You need to strip trailing newlines ("foo\n".strip()) before checking whether the line is a palindrome.

When you read a line from a file like this, you also get the newline character. So, e.g., you're seeing 'sefes\n', which when reversed is '\nsefes'. These two lines are of course not equal. One way to solve this is to use rstrip to remove these newlines:
with open("words.txt") as f:
    for line in f:
        line = line.rstrip()
        if line == line[::-1]:
            print line
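The answers above can be pulled together in a small Python 3 sketch (the thread's own code is Python 2). The helper name palindromes is made up for illustration, and io.StringIO stands in for open("words.txt") so the example runs without a file on disk.

```python
import io

def palindromes(lines):
    """Return the lines that read the same backwards, ignoring the line ending."""
    found = []
    for line in lines:
        word = line.rstrip("\r\n")  # drop only the line ending
        if word and word == word[::-1]:
            found.append(word)
    return found

# io.StringIO stands in for open("words.txt"); the words are the ones from the question.
sample = io.StringIO("sefes\nkurwa\nrawuk\nlol\nbollob\n")
print(palindromes(sample))  # ['sefes', 'lol', 'bollob']
```

Stripping only "\r\n" rather than all whitespace is a design choice: it keeps leading or trailing spaces as part of the word, which may or may not be what the task wants.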

Related

How to open a file in python, read the comments ("#"), find a word after the comments and select the word after it?

I have a function that loops through a file that looks like this:
"#" XDI/1.0 XDAC/1.4 Athena/0.9.25
"#" Column.4: pre_edge
Content
That is to say, everything after the "#" is a comment. My function aims to read each line and, if it starts with a specific word, select what comes after the ":".
For example, given these two lines, I would like to read through them and, if a line starts with "#" and contains the word "Column.4", store the word "pre_edge".
An example of my current approach follows:
with open(file, "r") as f:
    for line in f:
        if line.startswith ('#'):
            word = line.split(" Column.4:")[1]
        else:
            print("n")
I think my trouble is specifically this: after finding a line that starts with "#", how can I parse or search through it and save its content if it contains the desired word?
If the # comment contains the string Column.4: as stated above, you could parse it this way.
with open(filepath) as f:
    for line in f:
        if line.startswith('#'):
            # Comment lines are processed here
            if 'Column.4' in line:
                first, remainder = line.split('Column.4: ')
                # remainder contains everything after '# Column.4: '
                # To get the first word of it:
                word = remainder.split()[0]
        else:
            # Lines that are not comments can be processed here
            pass
Note
It is also good practice to iterate with for line in f: instead of calling f.readlines() (as in the other answers), because this way you don't load all lines into memory but process them one by one.
You should start by reading the file into a list and then work through that instead:
file = 'test.txt' #<- call file whatever you want
with open(file, "r") as f:
    txt = f.readlines()
for line in txt:
    if line.startswith ('"#"'):
        word = line.split(" Column.4: ")
        try:
            print(word[1])
        except IndexError:
            print(word)
    else:
        print("n")
Output:
>>> ['"#" XDI/1.0 XDAC/1.4 Athena/0.9.25\n']
>>> pre_edge
I used try/except because the first line also starts with "#" but cannot be split with your current logic.
Also, as a side note, the file in the question has lines that start with "#" including the quotation marks, so the startswith() argument was changed to match.
with open('stuff.txt', 'r+') as f:
    data = f.readlines()
for line in data:
    words = line.split()
    if words and ('#' in words[0]) and ("Column.4:" in words):
        print(words[-1])
# pre_edge
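As a variation on the answers above, here is a minimal sketch that uses str.partition, which splits once and never raises when the key is absent. It assumes the comment lines start with a bare # (not the quoted "#" shown in the question); the function name column_label and its key parameter are hypothetical.

```python
import io

def column_label(lines, key="Column.4"):
    """Return the text after '<key>:' on the first matching comment line, else None."""
    for line in lines:
        if not line.startswith("#"):
            continue  # not a comment line
        # partition returns (before, separator, after); separator is '' if not found
        _, sep, rest = line.partition(key + ":")
        if sep:
            return rest.strip()
    return None

# io.StringIO stands in for the real file.
header = io.StringIO("# XDI/1.0 XDAC/1.4 Athena/0.9.25\n# Column.4: pre_edge\nContent\n")
print(column_label(header))  # pre_edge
```

Because partition reports a missing key through an empty separator, the first header line is skipped without needing try/except.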

I am finding the next line after matching a line using startswith in Python, but I have multiple blank lines between the lines

Escape character is '^]'.
abc-2#terminal length 0
I am reading multiple files with the content shown above, and I am trying to find the line that follows "Escape character is '^]'.". Every file has a different number of blank lines between the two lines.
I wrote the code below, but it prints an empty line:
with open(report_file_path, "r") as in_file:
    for line in in_file:
        abc="Escape character is '^]'."
        if line.strip() == abc:
            result= next(in_file)
            print result
#Output should be : abc-2#terminal length 0
but I am getting an empty line.
Use a while loop to check whether the next line has any content.
Ex:
with open(filename2, "r") as in_file:
    for line in in_file:
        abc="Escape character is '^]'."
        if line.strip()==abc:
            while True:
                result= next(in_file)
                if result.strip():
                    break
            print(result)
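The while loop above raises StopIteration if the marker is followed only by blank lines. A possible alternative, sketched here in Python 3, passes a default to next() so that case returns None instead. The names MARKER and line_after_marker are made up for illustration, and io.StringIO stands in for the report file.

```python
import io

MARKER = "Escape character is '^]'."

def line_after_marker(lines, marker=MARKER):
    """Return the first non-blank line following `marker`, or None if there is none."""
    it = iter(lines)
    for line in it:
        if line.strip() == marker:
            # Keep pulling from the same iterator; the default avoids StopIteration.
            return next((c.strip() for c in it if c.strip()), None)
    return None

report = io.StringIO("Escape character is '^]'.\n\n   \n\nabc-2#terminal length 0\n")
print(line_after_marker(report))  # abc-2#terminal length 0
```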

How to skip a certain line in a text file and keep reading the next line in Python?

I have been searching for this answer but did not quite get it.
I have a text file that looks like this
who are you????
who are you man?
who are you!!!!
who are you? man or woman?
I want to skip the line with man in it and print
who are you????
who are you!!!!
My code so far
f = open("test.txt", "r")
word = "man"
for line in f:
    if word in line:
        f.next()
    else:
        print line
This prints the first line only
who are you????
How should I troubleshoot this problem?
Thank you for your help.
You don't need an if/else statement inside the for loop, so you can modify your code this way:
f = open("test.txt", "r")
word = "man"
for line in f:
    if word not in line:
        print line
Furthermore, the issue in your code is that you call f.next() directly inside a for loop that is already iterating over the file. This is why, when a line contains the word "man", your code skips two lines.
If you want to preserve the if/else statement because this is only an example of a more complex problem, you can use the following code:
f = open("test.txt", "r")
word = "man"
for line in f:
    if word in line:
        continue
    else:
        print line
Using continue, you skip one loop's iteration, and so you can reach your goal.
As Alex Fung suggests, it would be better to use with, so your code would become:
with open("test.txt", "r") as test_file:
    for line in test_file:
        if "man" not in line:
            print line
Problem
With your current code, when the current line contains "man" :
you don't print anything. That's correct.
you also skip the next line. That's your problem!
f.next() is already called implicitly by for line in f: at each iteration. So you actually call f.next() twice when "man" is found.
If the last line of your file contains a "man", Python will throw an exception because there's no next line.
You might have been looking for continue, which would also achieve the desired result but would be complex and unneeded. Note that it's called next in Perl and Ruby, which might be confusing.
Example
who are you???? # <- This line gets printed, because there's no "man" in it
who are you man? # word in line is True. Don't print anything. And skip next line
who are you!!!! # Line is skipped because of f.next()
who are you? man or woman? # word in line is True. Don't print anything.
# Try to skip next line, but there's no next line anymore.
# The script raises an exception : StopIteration
Correct code
Don't forget to close the file. You can do this automatically with with :
word = "man"
with open("test.txt") as f:
    for line in f:
        if word not in line:
            print line, # <- Note the comma to avoid double newlines
How about
f = open("test.txt", "r")
word = "man"
for line in f:
    if word not in line:
        print line
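To see the double skip concretely, here is a small Python 3 sketch (the thread's code is Python 2, where f.next() plays the role of next(f)). The list stands in for test.txt, and both function names are made up for illustration.

```python
LINES = [
    "who are you????\n",
    "who are you man?\n",
    "who are you!!!!\n",
    "who are you? man or woman?\n",
]

def skip_with_next(lines, word):
    """Mimic the question's loop: advance the iterator by hand on a match."""
    it = iter(lines)
    kept = []
    try:
        for line in it:
            if word in line:
                next(it)  # consumes one extra line, which is never examined
            else:
                kept.append(line)
    except StopIteration:
        pass  # a match on the last line leaves nothing for next() to consume
    return kept

def skip_with_filter(lines, word):
    """Keep every line that does not contain `word`."""
    return [line for line in lines if word not in line]

print(skip_with_next(LINES, "man"))    # only the first line survives
print(skip_with_filter(LINES, "man"))  # the two lines without "man"
```

The for loop and the explicit next() share one iterator, so every manual call moves the loop forward as well.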

strings in file do not match to string in a set

I have a file with one word per line and a set of words, and I want to append to the file the words from the set that are not already in it (collected in a set called 'out'). Here is part of my code:
def createNextU(self):
    print "adding words to final file"
    if not os.path.exists(self.finalFile):
        open(self.finalFile, 'a').close
    fin = open(self.finalFile,"r")
    out = set()
    for line in self.lines_seen: #lines_seen is a set with words
        if line not in fin:
            out.add(line)
        else:
            print line
    fin.close()
    fout= open(self.finalFile,"a+")
    for line in out:
        fout.write(line)
But it only matches some of the words that really are equal. I run it with the same dictionary of words, and it adds repeated words to the file on each run. What am I doing wrong? What is happening? I tried the '==' and 'is' comparators and got the same result.
Edit 1: I am working with huge files (finalFile) that can't be fully loaded into RAM, so I think I should read the file line by line.
Edit 2: Found a big problem with the file pointer:
def createNextU(self):
    print "adding words to final file"
    if not os.path.exists(self.finalFile):
        open(self.finalFile, 'a').close
    out = set()
    out.clear()
    with open(self.finalFile,"r") as fin:
        for word in self.lines_seen:
            fin.seek(0, 0)  # with this line speed drops to 40 lines/second, without it it doesn't work
            if word in fin:
                self.totalmatches = self.totalmatches+1
            else:
                out.add(word)
            self.totalLines=self.totalLines+1
    fout= open(self.finalFile,"a+")
    for line in out:
        fout.write(line)
If I put the lines_seen loop before opening the file, I open the file once for each line in lines_seen, and the speed only goes up to 30k lines/second. With set() I get 200k lines/second at worst, so I think I will load the file in parts and compare each part using sets. Is there any better solution?
Edit 3: Done!
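fin is a file handle, so if line not in fin does not do what you expect: the membership test iterates over the file, consuming it as it goes, and the lines it yields still carry their trailing newline. The content needs to be read first.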
with open(self.finalFile, "r") as fh:
    fin = fh.read().splitlines() # fin is now a list of words from finalFile
for line in self.lines_seen: #lines_seen is a set with words
    if line not in fin:
        out.add(line)
    else:
        print line
# remove fin.close()
EDIT:
Since lines_seen is a set, try to create a new set with the words from finalFile then diff the sets?
file_set = set()
with open(self.finalFile, "r") as fh:
    for f_line in fh:
        file_set.add(f_line.strip())
# This will give you all the words in finalFile that are not in lines_seen.
print file_set.difference(self.lines_seen)
Your comparison is likely not working because the lines read from the file will have a newline at the end, so you are comparing 'word\n' to 'word'. Using 'rstrip' will help remove the trailing newlines:
>>> foo = 'hello\n'
>>> foo
'hello\n'
>>> foo.rstrip()
'hello'
I would also iterate over the file, rather than iterate over the variable containing the words you would like to check against. If I've understood your code, you would like to write anything that is in self.lines_seen to self.finalFile, if it is not already in it. If you use 'if line not in fin' as you have, this will not work as you're expecting. For example, if your file contains:
lineone
linetwo
linethree
and the set lines_seen, being unordered, returns 'linethree' and then 'linetwo', then the following will match 'linethree' but not 'linetwo' because the file object has already read past it:
with open(self.finalFile,"r") as fin:
    for line in self.lines_seen:
        if line not in fin:
            print line
Instead, consider using a counter:
from collections import Counter
linecount = Counter()
# using 'with' means you don't have to worry about closing it once the block ends
with open(self.finalFile,"r") as fin:
    for line in fin:
        line = line.rstrip() # remove the right-most whitespace/newline
        linecount[line] += 1
for word in self.lines_seen:
    if word not in linecount:
        out.add(word)
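Since Edit 1 says the file is too large to hold in RAM, one possible variation is to keep only lines_seen in memory and cross words off as the file streams past. This is a Python 3 sketch under the assumption that the words in lines_seen carry no trailing newline; the function name words_missing_from_file is hypothetical, and io.StringIO stands in for finalFile.

```python
import io

def words_missing_from_file(lines_seen, file_lines):
    """Return the words in `lines_seen` that never appear as a line in `file_lines`."""
    missing = set(lines_seen)          # copy, so the caller's set is untouched
    for line in file_lines:            # single pass, one line in memory at a time
        missing.discard(line.rstrip("\r\n"))
        if not missing:                # everything was found, stop early
            break
    return missing

final_file = io.StringIO("lineone\nlinetwo\nlinethree\n")
lines_seen = {"linethree", "linetwo", "linefour"}
print(words_missing_from_file(lines_seen, final_file))  # {'linefour'}
```

Memory use is bounded by the size of lines_seen rather than the size of the file. When appending the result to the file, each word would need its own "\n" written after it.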

\n appending at the end of each line

I am writing lines one by one to an external file. Each line has 9 columns separated by a tab delimiter. If I split each line in that file and output the last column, I can see \n appended to the end of the 9th column. My code is:
#!/usr/bin/python
with open("temp", "r") as f:
    for lines in f:
        hashes = lines.split("\t")
        print hashes[8]
The last column values are integers, either 1 or 2. When I run this program, the output I get is:
['1\n']
['2\n']
I should only get 1 or 2. Why is '\n' being appended here?
I tried the following check to remove the problem.
with open("temp", "r") as f:
    for lines in f:
        if lines != '\n':
            hashes = lines.split("\t")
            print hashes[8]
This does not work either. I also tried if lines != ' '. How can I make this go away? Thanks in advance.
Try using strip on the lines to remove the \n (the new line character). strip removes the leading and trailing whitespace characters.
with open("temp", "r") as f:
    for lines in f.readlines():
        if lines.strip():
            # strip before splitting so the last column loses its "\n"
            hashes = lines.strip().split("\t")
            print hashes[8]
\n is the newline character, it is how the computer knows to display the data on the next line. If you modify the last item in the array hashes[-1] to remove the last character, then that should be fine.
Depending on the platform, your line ending may be more than just one character. Dos/Windows uses "\r\n" for example.
def clean(file_handle):
    for line in file_handle:
        yield line.rstrip()

with open('temp', 'r') as f:
    for line in clean(f):
        hashes = line.split('\t')
        print hashes[-1]
I prefer rstrip() for times when I want to preserve leading whitespace. That and using generator functions to clean up my input.
Because each line has 9 columns, the item at index 8 (the 9th item) ends with the line break that comes before the next line. Just slice it off:
print hashes[8][:-1]
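Slicing with [:-1] removes one character whatever it is, so it leaves "\r" behind on Windows line endings and cuts a digit off a final line that has no newline. A slightly more defensive Python 3 sketch strips only line-ending characters. The function name last_columns is made up, io.StringIO stands in for the temp file, and the sample rows use four columns instead of nine for brevity.

```python
import io

def last_columns(lines, sep="\t"):
    """Yield the last column of each non-blank line, without its line ending."""
    for line in lines:
        row = line.rstrip("\r\n")   # handles "\n" and "\r\n"; keeps tabs and spaces
        if row:
            yield row.split(sep)[-1]

# First row ends in "\r\n", then a blank line, then a last row with no newline at all.
data = io.StringIO("a\tb\tc\t1\r\n\nd\te\tf\t2")
print(list(last_columns(data)))  # ['1', '2']
```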
