I would like some help with a problem I'm facing as a new Python programmer. I created a .txt file in C++ in which some lines start with the # character, meaning they are comments, and I want to skip those lines when I read the file in my Python script. How can I do that?
I think this should help you.
I'll read the whole file and save all lines into a list.
Then I'll iterate over this list looking for the first character in every line.
If the first char is equal to "#", go to the next line.
Otherwise, append this line to a new list called selected_lines.
My code isn't especially efficient or a one-liner, but I think it may help you.
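selected_lines = []
filepath = "/usr//home/Desktop/myfile.txt"
with open(filepath, "r") as f:
    # readlines() already returns a list of lines, so assign it directly
    # (appending it would nest the whole list as a single element)
    lines = f.readlines()
for line in lines:
    if line[0:1] == "#":
        continue
    else:
        selected_lines.append(line)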
Something like this would work if it's just the beginning character. If you need it to ignore comments after code, you would need to modify it to if '#' in line: and handle it accordingly.
with open('somefile.txt', 'r') as f:
    for line in f:
        # Use continue so your code doesn't become a nested mess.
        # if this check passes, we can assume line is not a comment.
        if line[0] == '#':
            continue
        # Do stuff with line after checking for the comment.
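If comments can also follow data on the same line, the `'#' in line` idea mentioned above could be sketched roughly as follows. The helper name `strip_comments` and the sample text are made up for illustration, and the sketch assumes `#` never appears inside real data (for example inside quoted strings). It also drops lines that are blank once the comment is removed.

```python
import io

def strip_comments(lines, marker="#"):
    """Yield each line without its comment part, skipping lines that end up empty."""
    for line in lines:
        # Keep only the text before the first marker, then drop surrounding whitespace.
        content = line.split(marker, 1)[0].strip()
        if content:
            yield content

# io.StringIO stands in for a real file object returned by open(...).
sample = io.StringIO(
    "# header comment\n"
    "1 2 3\n"
    "4 5 6  # trailing comment\n"
    "\n"
    "   # indented comment\n"
    "7 8 9\n"
)
cleaned = list(strip_comments(sample))
print(cleaned)
```

With a real file, `strip_comments(f)` inside a `with open(...) as f:` block should behave the same way, since it only needs an iterable of lines.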
I am writing in Python 3.6 and am having trouble making my code match strings in a short text document. This is a simple example of the exact logic that is breaking my bigger program:
PATH = "C:\\Users\\JoshLaptop\\PycharmProjects\\practice\\commented.txt"
file = open(PATH, 'r')
words = ['bah', 'dah', 'gah', "fah", 'mah']
print(file.read().splitlines())
if 'bah' not in file.read().splitlines():
    print("fail")
with the text document formatted like so:
bah
gah
fah
dah
mah
and it is indeed printing out fail each time I run this. Am I using the incorrect method of reading the data from the text document?
The issue is the call print(file.read().splitlines()): it exhausts the file, so the next call to file.read().splitlines() returns an empty list.
A better way to "grep" your pattern would be to iterate on the file lines instead of reading it fully. So if you find the string early in the file, you save time:
with open(PATH, 'r') as f:
    for line in f:
        if line.rstrip()=="bah":
            break
    else:
        # else is reached when no break is called from the for loop: fail
        print("fail")
The small catch is to remember to call line.rstrip(), because iterating over the file yields each line with its line terminator. Also, if a line in your file has trailing whitespace, this code will still match the word (use strip() instead if you also want to match lines with leading blanks).
If you want to match a lot of words, consider creating a set of lines:
lines = {line.rstrip() for line in f}
so each membership test (word in lines) will be a lot faster.
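As a rough illustration, here is a sketch of the set approach that checks every entry of the `words` list from the question. `io.StringIO` stands in for the real file, and the variable name `missing` is only an example.

```python
import io

words = ['bah', 'dah', 'gah', 'fah', 'mah']

# io.StringIO stands in for open(PATH, 'r'); the content mirrors the sample file.
with io.StringIO("bah\ngah\nfah\ndah\nmah\n") as f:
    # Read the file once and keep the stripped lines in a set.
    lines = {line.rstrip() for line in f}

# Membership tests on a set take constant time on average.
missing = [word for word in words if word not in lines]
if missing:
    print("fail:", missing)
else:
    print("all words found")
```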
Try it:
PATH = "C:\\Users\\JoshLaptop\\PycharmProjects\\practice\\commented.txt"
file = open(PATH, 'r')
words = file.read().splitlines()
print(words)
if 'bah' not in words:
    print("fail")
You can't read the file two times.
When you do print(file.read().splitlines()), the file is read and the next call to this function will return nothing because you are already at the end of file.
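A small sketch of that behaviour, using `io.StringIO` as a stand-in for the opened file (the sample content is made up). It also shows that `seek(0)` rewinds the file if a second read is really needed.

```python
import io

# io.StringIO behaves like a text file opened for reading.
file = io.StringIO("bah\ngah\nfah\n")

first = file.read().splitlines()   # reads everything; the position is now at the end
second = file.read().splitlines()  # nothing left to read, so this is an empty list

file.seek(0)                       # rewind to the beginning of the file
third = file.read().splitlines()   # the full content is available again

print(first, second, third)
```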
PATH = "your_file"
file = open(PATH, 'r')
words = ['bah', 'dah', 'gah', "fah", 'mah']
if 'bah' not in (file.read().splitlines()):
    print("fail")
As you can see, the output is not 'fail'. You must call file.read().splitlines() only once in your code, or save its result in a variable; otherwise you get the 'fail' message.
I'm working on a very long project and have everything else done, but at the bottom of the file he wants us to read there are empty lines, literally just blank space, that we aren't allowed to delete. To work on the project I deleted them because I have no idea how to get around it, so my current open/read looks like this:
file = open("C:\\Users\\bh1337\\Documents\\2015HomicideLog_FINAL.txt" , "r")
lines=file.readlines()[1:]
file.close()
What do I need to add to this to ignore blank lines, or to stop when it gets to a blank line?
You can check if they are empty:
file = open('filename')
lines = [line for line in file.readlines() if line.strip()]
file.close()

# Or, looping over a file that is still open:
for line in file:
    if not line.strip():
        ...  # do something
The following is the best way to read files:
with open("fname.txt") as file:
    for line in file:
        if not line.strip():
            ...  # do something
The with statement takes care of closing the file.
If you want to ignore lines that contain only whitespace, here's a very simple way to skip them:
with open(file) as f_in:
    lines = list(line for line in (l.strip() for l in f_in) if line)
One way is to take the lines list and remove every element e for which e.strip() is empty. This deletes all lines that contain only whitespace.
Another way is to use f.readline() instead of f.readlines(), which reads the file one line at a time. First, initialize an empty list. If the line just read is empty after stripping, ignore it and continue with the next line. Otherwise, add the line to the list.
Hope this helps!
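The question also asks how to stop at the first blank line. One possible sketch uses `itertools.takewhile`; the sample text is made up, and `io.StringIO` stands in for the real log file.

```python
import io
from itertools import islice, takewhile

# Made-up content: a header line, two records, then trailing blank lines.
file = io.StringIO("header\nrow 1\nrow 2\n\n   \n\n")

# islice(file, 1, None) skips the header, like readlines()[1:] in the question.
# takewhile stops at the first line that is empty after stripping.
lines = list(takewhile(lambda line: line.strip(), islice(file, 1, None)))
print(lines)
```

Unlike the filtering approaches above, this stops reading at the first blank line, so any data after a blank line would be ignored.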
I have a text file that I need to manipulate. I want to add a line after each occurrence of the word "exactarch". That is, whenever "exactarch" occurs, I want to add text on the next line.
E.g. if this is the original file content,
[main]
cachedir=/var/cache/yum
keepcache=0
debuglevel=2
logfile=/var/log/yum.log
distroverpkg=redhat-release
tolerant=1
exactarch=1
gpgcheck=1
plugins=1
I want to change it as below:
[main]
cachedir=/var/cache/yum
keepcache=0
debuglevel=2
logfile=/var/log/yum.log
distroverpkg=redhat-release
tolerant=1
exactarch=1
obsoletes=1
gpgcheck=1
plugins=1
This is what I tried to do:
with open('file1.txt') as f:
    for line in input_data:
        if line.strip() == 'exactarch':
            f.write('obsoletes=1')
Obviously this is not working, as I can't figure out how to count the lines and write to this line.
You ask for a Python solution. But tasks like this are made to be solved using simpler tools.
If you are using a system that has sed, you can do this in a simple one-liner:
$ sed '/exactarch/aobsoletes=1' < in.txt
What does this mean?
sed: the executable
/exactarch/: matches all lines that contain exactarch
a: after the current line, append a new line with the following text
obsoletes=1: the text to append in a new line
Output:
[main]
cachedir=/var/cache/yum
keepcache=0
debuglevel=2
logfile=/var/log/yum.log
distroverpkg=redhat-release
tolerant=1
exactarch=1
obsoletes=1
gpgcheck=1
plugins=1
Edit:
To modify the file in place, use the option -i and the file as an argument:
$ sed -i '/exactarch/aobsoletes=1' in.txt
Simple: read all the lines, find the right line, and insert the desired line after it. Then dump the resulting lines to a file.
with open('lines.txt') as f:
    lines = f.readlines()
# index() finds the first exact match and raises ValueError if there is none
lines.insert(lines.index('exactarch=1\n') + 1, 'obsoletes=1\n')
with open('dst.txt', 'w') as f:
    for l in lines:
        f.write(l)
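If the file may contain several matching lines, or the value after `exactarch=` is not always `1`, a variation along these lines might be more forgiving. The helper name `insert_after` is hypothetical, and the sketch matches by substring, which is an assumption about what "occurrence of the word" should mean.

```python
def insert_after(lines, target, new_line):
    """Return a new list with new_line added after every line containing target."""
    result = []
    for line in lines:
        # Make sure a matched line ends with a newline before appending after it.
        if target in line and not line.endswith('\n'):
            line += '\n'
        result.append(line)
        if target in line:
            result.append(new_line)
    return result

original = ['tolerant=1\n', 'exactarch=1\n', 'gpgcheck=1\n', 'plugins=1\n']
updated = insert_after(original, 'exactarch', 'obsoletes=1\n')
print(''.join(updated), end='')
```

The result can be written out with `f.writelines(updated)`, as in the snippet above.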
The past says it's pretty simple - replacing words in files is not a new thing.
If you want to replace a word, you can use the solution implemented there. In your context:
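import fileinput
# fileToSearch is the path of the file to edit
for line in fileinput.input(fileToSearch, inplace=True):
    # Match the full "exactarch=1" so no stray "=1" ends up after the new line
    print(line.replace("exactarch=1", "exactarch=1\nobsoletes=1"), end='')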
I am hesitant to use fileinput, because if something goes wrong during the 'analysis' phase, you are left with a file in whatever condition it was in at the moment of failure. I would read everything in and then do all the work on it. The code below ensures that:
Your inserted value ends with a newline '\n' if it's not going to be the last item.
It will not add duplicate inserted values, because it checks the line below.
It iterates through all values in case multiple "exactarch=1"s were added since the snippet last ran.
Hope this helps, albeit not as stylish as a one- or two-liner.
with open('test.txt') as f:
    data = f.readlines()
insertValue = 'obsoletes=1'
for point, item in enumerate(data):  # enumerate gives the real index, even with duplicate lines
    if item.rstrip() == 'exactarch=1':  # find it in the middle or on the last line (i.e. no '\n')
        if point + 1 == len(data):  # exactarch=1 is the last line
            if not item.endswith('\n'):
                data[point] = item + '\n'  # terminate it so the new value lands on its own line
            data.insert(point + 1, insertValue)
        elif data[point + 1].rstrip() != insertValue:  # make sure the value isn't already below exactarch=1
            data.insert(point + 1, insertValue + '\n')
            print('insertValue added below "exactarch=1"')
        else:
            print('insertValue already exists below exactarch=1')
with open('test.txt', 'w') as f:
    f.writelines(data)
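To take the "don't leave a half-written file" concern one step further, one option is to write the result to a temporary file and only then swap it in with `os.replace`. This is a sketch under assumptions: the helper names `rewrite_safely` and `add_obsoletes` are hypothetical, the demo file is created in a throwaway directory, and the replacement file gets the temporary file's permissions rather than the original's. Note that `add_obsoletes` also terminates a final line that lacks a newline.

```python
import os
import tempfile

def rewrite_safely(path, transform):
    """Write transform(lines) to a temporary file, then swap it in for path."""
    with open(path) as f:
        lines = f.readlines()
    new_lines = transform(lines)  # if this raises, the original file is untouched
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file next to the target so os.replace stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, text=True)
    try:
        with os.fdopen(fd, 'w') as tmp:
            tmp.writelines(new_lines)
        os.replace(tmp_path, path)  # swap the new file in for the original
    except BaseException:
        os.remove(tmp_path)
        raise

def add_obsoletes(lines):
    """Example transform: add obsoletes=1 after each exactarch=1 line unless already present."""
    result = []
    for i, line in enumerate(lines):
        result.append(line if line.endswith('\n') else line + '\n')
        following = lines[i + 1].rstrip() if i + 1 < len(lines) else None
        if line.rstrip() == 'exactarch=1' and following != 'obsoletes=1':
            result.append('obsoletes=1\n')
    return result

# Self-contained demo in a throwaway directory.
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, 'test.txt')
with open(demo_path, 'w') as f:
    f.write('tolerant=1\nexactarch=1\ngpgcheck=1\n')

rewrite_safely(demo_path, add_obsoletes)
rewrite_safely(demo_path, add_obsoletes)  # running it again adds nothing

with open(demo_path) as f:
    result_text = f.read()
print(result_text, end='')
```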
I need to edit my file and save it so that I can use it for another program. First I need to put "," between every word and add a word at the end of every line.
To put "," between every word, I used this command:
for line in open('myfile','r+') :
    for word in line.split():
        new = ",".join(map(str,word))
        print new
I'm not too sure how to overwrite the original file, or maybe create a new output file for the edited version. I tried something like this:
with open('myfile','r+') as f:
    for line in f:
        for word in line.split():
            new = ",".join(map(str,word))
            f.write(new)
The output is not what I wanted (it differs from what print new shows).
Second, I need to add a word at the end of every line, so I tried this:
source = open('myfile','r')
output = open('out','a')
output.write(source.read().replace("\n", "yes\n"))
The code to add the new word works perfectly. But I was thinking there should be an easier way to open a file, do both edits in one go, and save it, and I'm not too sure how. I've spent a tremendous amount of time trying to figure out how to overwrite the file, and it's about time I asked for help.
Here you go:
source = open('myfile', 'r')
output = open('out','w')
output.write('yes\n'.join(','.join(line.split()) for line in source.read().split('\n')))
One-liner:
open('out', 'w').write('yes\n'.join(','.join(line.split()) for line in open('myfile', 'r').read().split('\n')))
Or more legibly:
source = open('myfile', 'r')
processed_lines = []
for line in source:
    # split() already drops the trailing newline, so add the word and the newline back explicitly
    line = ','.join(line.split()) + 'yes\n'
    processed_lines.append(line)
output = open('out', 'w')
output.write(''.join(processed_lines))
EDIT
Apparently I misread everything, lol.
#It looks like you are writing the word yes to all of the lines, then splitting
#each word into letters and listing those words' letters on their own line?
source = open('myfile','r')
output = open('out','w')
for line in source:
    for word in line.split():
        new = ",".join(word)
        print >>output, new
    print >>output, 'y,e,s'
How big is this file?
Maybe you could create a temporary list containing everything from the file you want to edit, with each element representing one line.
Editing a list of strings is pretty simple.
After your changes, you can just open your file again with
writable = open('configuration', 'w')
and then write the changed lines to the file with
writable.write(currentLine + '\n')
Hope that helps - even a little bit. ;)
For the first problem, you could read all the lines in f before overwriting f, assuming f is opened in 'r+' mode. Append all the results into a string, then execute:
f.seek(0) # reset file pointer back to start of file
f.write(new) # new should contain all concatenated lines
f.truncate() # get rid of any extra stuff from the old file
f.close()
For the second problem, the solution is similar: Read the entire file, make your edits, call f.seek(0), write the contents, f.truncate() and f.close().
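Putting both edits together, the read / seek / write / truncate sequence might look like the sketch below. It works on a throwaway temporary file with made-up content, and it appends the word directly to each line, as the `replace("\n", "yes\n")` call in the question does.

```python
import os
import tempfile

# Create a small throwaway file so the sketch is self-contained.
fd, path = tempfile.mkstemp(text=True)
with os.fdopen(fd, 'w') as f:
    f.write("one two three\nfour five\n")

with open(path, 'r+') as f:
    # Read everything first, so nothing is overwritten before it has been read.
    lines = f.read().splitlines()
    # Join the words of each line with commas and append the extra word.
    new = ''.join(','.join(line.split()) + 'yes\n' for line in lines)
    f.seek(0)      # reset file pointer back to start of file
    f.write(new)   # overwrite from the beginning
    f.truncate()   # cut off anything left over from the old content

with open(path) as f:
    result = f.read()
print(result, end='')
os.remove(path)
```

The `with` block closes the file, so no explicit `f.close()` is needed here.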
I'm trying to write a Python script that uses a particular external application belonging to the company I work for. I can generally figure things out for myself when it comes to programming and scripting, but this time I am truly lost!
I can't seem to figure out why the while loop won't function as it is meant to. It doesn't give any errors, which doesn't help me. It just seems to skip past the important part of the code in the centre of the loop and then goes on to increment the "count" like it should afterwards!
f = open('C:/tmp/tmp1.txt', 'w') #Create a temporary textfile
f.write("TEXTFILE\nTEXTFILE\nTEXTFILE\nTEXTFILE\nTEXTFILE\nTEXTFILE\n") #Put some simple text in there
f.close() #Close the file
count = 0 #Insert the line number from the text file you want to begin with (first line starts with 0)
num_lines = sum(1 for line1 in open('C:/tmp/tmp1.txt')) #Get the number of lines from the textfile
f = open('C:/tmp/tmp2.txt', 'w') #Create a new textfile
f.close() #Close it
while (count < num_lines): #Keep the loop within the starting line and total number of lines from the first text file
    with open('C:/tmp/tmp1.txt', 'r') as f: #Open the first textfile
        line2 = f.readlines() #Read these lines for later input
        for line2[count] in f: #For each line from chosen starting line until last line from first text file,...
            with open('C:/tmp/tmp2.txt', 'a') as g: #...with the second textfile open for appending strings,...
                g.write("hello\n") #...write 'hello\n' each time while "count" < "num_lines"
    count = count + 1 #Increment the "count"
I think everything works up until: "for line2[count] in f:"
The real code I'm working on is somewhat more complicated, and the application I'm using isn't exactly for sharing, so I have simplified the code to give silly outputs instead just to fix the problem.
I'm not looking for alternative code, I'm just looking for a reason why the loop isn't working so I can try to fix it myself.
All answers will be appreciated, and thanking everyone in advance!
Cormac
Some comments:
num_lines = sum(1 for line1 in open('C:/tmp/tmp1.txt'))
Why? What's wrong with len(open(filename, 'rb').readlines())?
while (count < num_lines):
    ...
    count = count + 1
This is bad style; you could use:
for i in range(num_lines):
    ...
Note that I named your index i, which is universally recognized, and that I used range and a for loop.
Now, your problem, like I said in the comment, is that f is a file (that is, a stream of bytes with a location pointer) and you've read all the lines from it. So when you do for line2[count] in f:, it will try reading a line into line2[count] (this is a bit weird, actually, you almost never use a for loop with a list member as an index but apparently you can do that), see that there's no line to read, and never executes what's inside the loop.
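Both points can be checked with a short sketch. `io.StringIO` stands in for the opened tmp1.txt, and the names `iterations` and `slots` are made up for illustration.

```python
import io

f = io.StringIO("TEXTFILE\nTEXTFILE\nTEXTFILE\n")  # stands in for the opened tmp1.txt

line2 = f.readlines()      # consumes the whole file
iterations = 0
for line in f:             # the file is already at its end, so the body never runs
    iterations += 1
print(len(line2), iterations)

# A list element is a legal loop target: each item is assigned into it in turn.
slots = ['a', 'b', 'c']
for slots[0] in ['x', 'y']:
    pass
print(slots)
```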
Anyway, you want to read a file, line by line, starting from a given line number? Here's a better way to do that:
from itertools import islice
start_line = 0 # change this
filename = "foobar" # also this
with open(filename, 'rb') as f:
    for line in islice(f, start_line, None):
        print(line)
I realize you don't want alternative code, but your code really is needlessly complicated.
If you want to iterate over the lines in the file f, I suggest replacing your "for" line with
for line in line2:
    # do something with "line"...
You put the lines in an array called line2, so use that array! Using line2[count] as a loop variable doesn't make sense to me.
You seem to misunderstand how the 'for line in f' loop works. It iterates over the file, reading one line at a time until there are no lines left. But at the moment you start the loop, all the lines have already been read (via f.readlines()) and the file's current position is at the end. You could achieve what you want by calling f.seek(0), but that doesn't seem like a good idea anyway, since you would read the file again and that's slow I/O.
Instead you want to do something like:
for line in line2[count:]:  # iterate over the lines read, starting with line `count`
    do_smth_with(line)