Remove comment lines from a file - python

I'm making a file type to store information from my program. The file type can include lines starting with #, like:
# This is a comment.
As shown, the # in front of a line denotes a comment.
I've written a program in Python that can read these files:
fileData = []
file = open("Tutorial.rdsf", "r")
line = file.readline()
while line != "":
    fileData.append(line)
    line = file.readline()
for item in list(fileData):
    item.strip()
fileData = list(map(lambda s: s.strip(), fileData))
print(fileData)
As you can see, it takes the file, adds every line as an item in a list, and strips the items of \n. So far, so good.
But these files often contain comments I've made, and as such the program adds them to the list.
Is there a way to delete all items in the list starting with #?
Edit: To make things a bit clearer: Comments won't be like this:
Some code:
{Some Code} #Foo
They'll be like this:
#Foo
Some code:
{Some Code}

You can process lines directly in a for loop:
with open("Tutorial.rdsf", "r") as file:
    for line in file:
        if line.startswith('#'):
            continue  # skip comments
        line = line.strip()
        # do more things with this line
Only put them into a list if you need random access (e.g. you need to access lines at specific indices).
I used a with statement to manage the open file; when Python reaches the end of the with block, the file is automatically closed for you.
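The loop above can also be wrapped in a generator, so the caller decides whether to build a list. This is only a sketch: the function name iter_data_lines and the sample file content are assumptions made for illustration, and a temporary file stands in for Tutorial.rdsf.

```python
import os
import tempfile

def iter_data_lines(path):
    """Yield each stripped line of the file, skipping lines that start with '#'."""
    with open(path, "r") as file:
        for line in file:
            if line.startswith("#"):
                continue  # skip comments
            yield line.strip()

# Build a small throwaway file (hypothetical content) to try the function on.
fd, sample_path = tempfile.mkstemp(suffix=".rdsf")
with os.fdopen(fd, "w") as handle:
    handle.write("#Foo\nSome code:\n{Some Code}\n")

data_lines = list(iter_data_lines(sample_path))
os.remove(sample_path)
print(data_lines)
```

Calling list() on the generator gives random access when it is needed; iterating over it directly keeps only one line in memory at a time.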

It's easy to check for leading # signs.
Change this:
while line != "":
    fileData.append(line)
    line = file.readline()
to this:
while line != "":
    if not line.startswith("#"):
        fileData.append(line)
    line = file.readline()
But your program is a bit complicated for what it does. Look in the documentation for the part that explains iterating over a file with for line in file:.
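If the list has already been built, the comment items can also be filtered out afterwards. A minimal sketch, assuming fileData holds the stripped lines (the sample values here are made up):

```python
# Hypothetical list, as the question's program would build it after stripping.
fileData = ["#Foo", "Some code:", "{Some Code}", "# This is a comment."]

# Keep every item that does not start with '#'.
fileData = [item for item in fileData if not item.startswith("#")]
print(fileData)
```

This builds a new list instead of deleting items in place, which avoids the usual trouble of removing elements from a list while looping over it.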

Related

Insert new data for each text line in python

I have a text file that looks like this:
1,004,59
1,004,65
1,004,69
1,005,55
1,005,57
1,006,53
1,006,59
1,007,65
1,007,69
1,007,55
1,007,57
1,008,53
I want to create a new text file in which 'input' is appended to each line, something like this:
1,004,59,input
1,004,65,input
1,004,69,input
1,005,55,input
1,005,57,input
1,006,53,input
1,006,59,input
1,007,65,input
1,007,69,input
1,007,55,input
1,007,57,input
1,008,53,input
I have attempted something like this:
with open('data.txt', 'a') as f:
    lines = f.readlines()
    for i, line in enumerate(lines):
        line[i] = line[i].strip() + 'input'
    for line in lines:
        f.writelines(line)
Not able to get the right approach though.
What you want is to be able to read and write to the file in place (at the same time). Python comes with the fileinput module which is good for this purpose:
import fileinput
for line in fileinput.input('data.txt', inplace=True):
    line = line.rstrip()
    print(line + ",input")
Discussion
The fileinput.input() function returns an iterable object that reads your file line by line. Each line ends with a newline (either \n or \r\n, depending on the operating system).
The code then strips that newline from each line, adds the ",input" part, and prints the result. Note that because of fileinput magic, the output of print goes back into the file instead of to the console.
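To watch the in-place behavior without touching real data, the same loop can be run against a temporary file. This is a sketch under that assumption; the two sample rows are copied from the question and the temporary path is created only for the demonstration.

```python
import fileinput
import os
import tempfile

# Create a throwaway file so the original data.txt is not touched.
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as f:
    f.write("1,004,59\n1,004,65\n")

# With inplace=True, print() writes into the file rather than to the console.
for line in fileinput.input(path, inplace=True):
    print(line.rstrip() + ",input")

# Read the file back to see what was written, then clean up.
with open(path) as f:
    result = f.read()
os.remove(path)
```

After the loop finishes, standard output is restored, so later print calls go to the console again.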
There is a newline '\n' at the end of every line in your file, so you need to handle it.
edit: oh I forgot about the rstrip() function!
tmp = []
with open("input.txt", 'r') as file:
    appendtext = ",input\n"
    for line in file:
        tmp.append(line.rstrip() + appendtext)
with open("input.txt", 'w') as file:
    file.writelines(tmp)
Added:
The answer by Hai_Vu is great if you use fileinput, since you don't have to open the file twice as I did.
To do only what you're asking, I would go for something like:
newLines = list()
with open('data.txt', 'r') as f:
    lines = f.readlines()
for line in lines:
    newLines.append(line.strip() + ',input\n')
with open('data2.txt', 'w') as f2:
    f2.writelines(newLines)
But there are definitely more elegant solutions.
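One possible variation is to stream from the source file to the new file in a single pass, without building a list first. The helper name append_field and its field parameter are hypothetical, and a temporary directory stands in for the real data.txt and data2.txt.

```python
import os
import shutil
import tempfile

def append_field(src_path, dst_path, field="input"):
    """Copy src_path to dst_path, adding ',<field>' to the end of every line."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        for line in src:
            dst.write(line.rstrip("\r\n") + "," + field + "\n")

# Set up sample files in a throwaway directory.
workdir = tempfile.mkdtemp()
src_path = os.path.join(workdir, "data.txt")
dst_path = os.path.join(workdir, "data2.txt")
with open(src_path, "w") as f:
    f.write("1,004,59\n1,004,65\n")

append_field(src_path, dst_path)
with open(dst_path) as f:
    new_lines = f.read().splitlines()
shutil.rmtree(workdir)
print(new_lines)
```

Because only one line is held at a time, this also works for files too large to fit in memory.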

using readline() in python to read a txt file but the second line is not read in aws comprehend api

I am reading a text file and passing it to the API, but then I am getting the result only for the first line in the file, the subsequent lines are not being read.
Code below:
filename = 'c:\myfile.txt'
with open(filename) as f:
    plain_text = f.readline()
    response = client_comprehend.detect_entities(
        Text=plain_text,
        LanguageCode='en'
    )
    entites = list(set([x['Type'] for x in response['Entities']]))
    print response
    print entites
When you call f.readline(), it only reads the first line of the file. So if you want to go through each line of the file, you have to loop over it. Alternatively, if you want to read the entire file at once (not recommended for big files), you can use f.read().
filename = r'c:\myfile.txt'
with open(filename) as f:
    for plain_text in f:
        response = client_comprehend.detect_entities(
            Text=plain_text,
            LanguageCode='en'
        )
        entites = list(set([x['Type'] for x in response['Entities']]))
        print(response)
        print(entites)
As csblo has pointed out in the comments, your readline is only reading the first line of the file because it's only being called once. As written, the program calls readline once, performs the actions for that single line, and then exits without doing anything else.
Conveniently, file objects can be iterated over in a for loop like you would a list. Iterating over a file will return one line per iteration, as though you had called readline and assigned it to a value. Using this, your code will work when rewritten as such:
filename = r'c:\myfile.txt'
with open(filename) as f:
    for plain_text_line in f:
        response = client_comprehend.detect_entities(
            Text=plain_text_line,
            LanguageCode='en'
        )
        entites = list(set([x['Type'] for x in response['Entities']]))
        print(response)
        print(entites)
This should iterate over all lines of the file in turn.
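The per-line loop can be tried without calling the real service by passing the detection call in as a function. Everything named here is an assumption for illustration: entity_types_per_line and fake_detect are hypothetical, and fake_detect only imitates the shape of the response used in the question (a dict with an 'Entities' list of dicts carrying a 'Type' key). Blank lines are skipped on the assumption that empty text is not worth sending.

```python
import io

def entity_types_per_line(lines, detect):
    """Call detect(text) for each non-blank line and collect the entity types."""
    results = []
    for raw in lines:
        text = raw.strip()
        if not text:
            continue  # nothing to analyse on a blank line
        response = detect(text)
        results.append(sorted({e["Type"] for e in response["Entities"]}))
    return results

def fake_detect(text):
    # Hypothetical stand-in for the real API call; not the real service.
    entities = [{"Type": "PERSON"}] if "Alice" in text else []
    return {"Entities": entities}

# io.StringIO yields lines the same way an open file does.
sample = io.StringIO("Alice went home.\n\nNothing here.\n")
types = entity_types_per_line(sample, fake_detect)
print(types)
```

With the real client from the question, the second argument could be lambda t: client_comprehend.detect_entities(Text=t, LanguageCode='en').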

how do you open .txt file in python in one line

I'm trying to open a .txt file and am getting confused about which part goes where. I also want the spaces removed when I open the text file in Python. And when answering, could you make the file name 'clues'?
My first try is:
def clues():
    file = open("clues.txt", "r+")
    for line in file:
        string = ("clues.txt")
        print (string)
My second try is:
def clues():
    f = open('clues.txt')
    lines = [line.strip('\n') for line in open ('clues.txt')]
The third try is:
def clues():
    f = open("clues.txt", "r")
    print f.read()
    f.close()
Building upon @JonKiparsky's answer, it would be safer for you to use the Python with statement:
with open("clues.txt") as f:
    text = f.read().replace(" ", "")
If you want to read the whole file with the spaces removed, f.read() is on the right track: unlike your other attempts, it gives you the whole file as a single string, not one line at a time. But you still need to remove the spaces, and you need to do that explicitly. For example:
f.read().replace(' ', '')
Or, if you want to replace all whitespace, not just spaces:
''.join(f.read().split())
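The difference between the two expressions shows up as soon as the text contains tabs or newlines. A small sketch, using io.StringIO with made-up content in place of clues.txt:

```python
import io

# Stand-in for open("clues.txt"); the content is invented for the demo.
f = io.StringIO("the cat\tsat\non the mat\n")
text = f.read()

no_spaces = text.replace(" ", "")       # removes only the space character
no_whitespace = "".join(text.split())   # removes spaces, tabs and newlines

print(repr(no_spaces))
print(repr(no_whitespace))
```

The first form keeps the line structure of the file, while the second collapses everything into one unbroken string.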
This line:
f = open("clues.txt")
will open the file; that is, it returns a file handle that you can read from.
This line:
open("clues.txt").read().replace(" ", "")
will open the file and return its contents, with all spaces removed.

read numbers from text file python

I am trying to read a text file and return its contents. The text file contains a matrix. When I run my code with the file, it just prints the first line. My code looks right, and I have searched online but can't seem to find the problem.
Code is:
def main():
    matrix = "matrix1.txt"
    print(readMatrix(matrix))
def readMatrix(matrix):
    matrixFile = open(matrix, "r")
    line = matrixFile.readline()
    while line != "":
        return line
        line = matrixFile.readline()
    matrixFile.close()
main()
while line != "":
    return line # function ends
Maybe you mean:
while line != "":
    print(line)
    line = matrixFile.readline()
return passes the value you give it back to the caller and ends the function call. If you want to print each line, use print instead of return.
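The effect of return inside a loop can be seen by comparing two small functions. Both names and the sample text are hypothetical; io.StringIO stands in for the opened matrix file.

```python
import io

def first_line_only(lines):
    for line in lines:
        return line  # the function ends here on the first pass

def all_lines(lines):
    collected = []
    for line in lines:
        collected.append(line)
    return collected  # reached only after the loop has finished

text = "1 2 3\n4 5 6\n"
first = first_line_only(io.StringIO(text))
everything = all_lines(io.StringIO(text))
print(repr(first))
print(everything)
```

Collecting the lines and returning once, after the loop, lets the caller decide what to print.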
You're misusing the return statement. When a function hits a return, control returns to the caller and does not return to the function. Thus, the most your function will do is read one line and return it, or close the file if the first line is empty.
Files in Python have a built-in iterator that will give you every line in the file, used like so:
with open(path) as f:
    for line in f:
        [do something]
Note the use of the with statement. It will automatically close the file when its block is exited, which makes it the preferred way to deal with reading/writing files.
So what you want to do could be something like
def readMatrix(path):
    lines = []
    with open(path) as f:
        for line in f:  # the loop stops by itself at the end of the file
            lines.append(line.strip())
    return lines
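Since the file holds a matrix, the lines can also be turned into rows of numbers while iterating. This is a sketch under assumptions the question does not state: values are separated by whitespace, they parse as floats, and blank lines can be ignored. The name read_matrix and the sample data are made up.

```python
import io

def read_matrix(lines):
    """Parse whitespace-separated numbers into a list of rows (lists of floats)."""
    matrix = []
    for line in lines:
        if line.strip():  # ignore blank lines
            matrix.append([float(value) for value in line.split()])
    return matrix

# io.StringIO stands in for open("matrix1.txt"); the numbers are invented.
sample = io.StringIO("1 2 3\n4 5 6\n")
matrix = read_matrix(sample)
print(matrix)
```

With a real file, the call would sit inside a with block: with open("matrix1.txt") as f: matrix = read_matrix(f).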

Python weird file name on create

I have a txt file with list of html/doc files, I want to download them using python and save them as 1.html, 2.doc, 3.doc, ...
http://example.com/kran.doc
http://example.com/loj.doc
http://example.com/sks.html
I've managed to create a fully functional script, except that Python always adds a question mark to the end of the newly created file name (if you look from Linux), and if you look from Windows the file name is something like 5CFB43~X.
import urllib2
st = 1;
for line in open('links.txt', 'r'):
    u = urllib2.urlopen(line)
    ext = line.split(".")
    imagefile = str(st)+"."+ext[-1]
    #file created should be something.doc but its something.doc? -> notice question mark
    fajl = open(imagefile, "w+")
    fajl.write(u.read())
    fajl.close()
    print imagefile
    st += 1
The line terminator is two characters, not one.
for line in open('links.txt', 'rU'):
But not anymore.
Work on line.strip() instead of line.
That's because lines read this way will end up with '\n' at the end, hence the ?
Just add the following at the beginning of your loop:
if line.endswith('\n'):
    line = line[:-1]
Or as AKX pointed out in the comments, just:
line = line.rstrip('\r\n')
And so you cover any kind of line ending.
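The stray character is easy to see by comparing the extension taken from the raw line with the one taken after stripping. A sketch with no network access: the URLs come from the question, but the line endings are assumed, and the list stands in for the lines read from links.txt.

```python
# Lines as they might come out of links.txt, with two kinds of line ending.
lines = ["http://example.com/kran.doc\r\n", "http://example.com/sks.html\n"]

# Without stripping, the line ending becomes part of the extension.
raw_ext = lines[0].split(".")[-1]

names = []
for number, line in enumerate(lines, start=1):
    url = line.rstrip("\r\n")        # drop any trailing \r and \n
    ext = url.rsplit(".", 1)[-1]     # text after the last dot
    names.append("{}.{}".format(number, ext))

print(repr(raw_ext))
print(names)
```

The stripped url is also what should be passed on to the download call, since the trailing newline is not part of the address.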
