Skipping Blank lines in read file python - python

I'm working on a very long project and have everything done, but the file he wants us to read has empty lines at the bottom, literally just blank lines that we aren't allowed to delete. To work on the project I deleted them, because I had no idea how to get around it. My current open/read code looks like this:
file = open("C:\\Users\\bh1337\\Documents\\2015HomicideLog_FINAL.txt" , "r")
lines=file.readlines()[1:]
file.close()
What do I need to add to this to ignore blank lines, or to stop when it gets to a blank line?

You can check if they are empty:
file = open('filename')
lines = [line for line in file.readlines() if line.strip()]
file.close()
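The question also asks how to stop at the first blank line rather than skip blank lines. A minimal sketch of one way to do that, assuming the blank lines only appear after the data: itertools.takewhile stops reading as soon as a line is empty after stripping. The helper name read_until_blank and the sample text are made up for illustration, and io.StringIO stands in for the real file.

```python
import io
from itertools import takewhile


def read_until_blank(file_obj, skip_header=True):
    """Return the lines up to (not including) the first blank line."""
    if skip_header:
        # Discard the first line, like readlines()[1:] in the question.
        next(file_obj, None)
    # takewhile stops at the first line that is empty after stripping.
    return [line.rstrip("\n") for line in takewhile(lambda l: l.strip(), file_obj)]


# io.StringIO is a stand-in for open(...); the content is invented sample data.
sample = io.StringIO("header\nrow 1\nrow 2\n\n   \n")
rows = read_until_blank(sample)
print(rows)
```

With a real file, the same function could be called inside a with open(...) block, passing the file object instead of the StringIO stand-in.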

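for line in file:
    if not line.strip():
        continue  # blank line: skip it (use break instead to stop reading here)
    # ... do something with the non-blank line
The following is the best way to read files:
with open("fname.txt") as file:
    for line in file:
        if not line.strip():
            continue  # blank line: skip it
        # ... do something with the non-blank line
The with statement takes care of closing the file.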
If you want to ignore lines with only whitespace

Here's a very simple way to skip the empty lines:
with open(file) as f_in:
    lines = list(line for line in (l.strip() for l in f_in) if line)

One way is to take the lines list and remove every element e for which e.strip() is empty. This way, you delete all lines that contain only whitespace.
Another way is to use f.readline() instead of f.readlines(), which reads the file one line at a time. First, initialize an empty list. If the line just read is empty after stripping, ignore it and continue with the next line. Otherwise, add the line to the list.
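The readline() approach described above could look roughly like this. It is only a sketch: the function name read_non_blank and the sample text are made up, and io.StringIO stands in for the opened file.

```python
import io


def read_non_blank(file_obj):
    """Collect every line that still has content after stripping."""
    kept = []
    while True:
        line = file_obj.readline()
        if line == "":
            # readline() returns an empty string only at end of file;
            # a blank line in the file comes back as "\n", not "".
            break
        if not line.strip():
            continue  # whitespace-only line: ignore it
        kept.append(line)
    return kept


# io.StringIO is a stand-in for the real file (assumption for illustration).
sample = io.StringIO("a\n\n  \nb\n")
non_blank = read_non_blank(sample)
print(non_blank)
```

The end-of-file check has to come before the strip() check, otherwise a blank line in the middle of the file would be mistaken for the end.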
Hope this helps!

Related

Extracting the data from the same position over multiple lines in a string

Fairly simple question, but I can't figure out where I'm going wrong. I have a text file which I have split into multiple lines. I want to print a certain location from each line, characters 14 to 20, but when I run the code below it prints a blank set of characters.
with open('filetxt', 'r') as file:
    data = file.read().rstrip()
for line in data:
    print(line[14:20])
If you want to read the file line by line, try:
with open('filetxt', 'r') as file:
    for line in file:
        print(line[14:20])
I think you're using the wrong method. read() reads the whole file at once into a single string, so the for loop iterates over individual characters, and slicing a single character at [14:20] gives an empty string. You might want to use readlines(), which returns a list of the lines read, i.e.:
with open('filetxt', 'r') as file:
    lines = file.readlines()
for line in lines:
    print(line[14:20])
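To make the difference visible, here is a small sketch that compares the two loops on invented sample text. io.StringIO stands in for the file, and the variable names are only for illustration.

```python
import io

# Invented two-line sample; each line is long enough to have characters 14 to 20.
text = "0123456789ABCDEFGHIJKLMNOP\nabcdefghijklmnopqrstuvwxyz\n"

# read() returns one string, so a for loop over it yields single characters.
data = io.StringIO(text).read().rstrip()
char_slices = [item[14:20] for item in data[:3]]

# Iterating over the file object itself yields whole lines.
line_slices = [line[14:20] for line in io.StringIO(text)]

print(char_slices)  # empty strings: a one-character string has no index 14
print(line_slices)
```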

skip a line when it has a # character in python?

I would like some help with a problem I'm facing as a new Python programmer. I created a .txt file from C++ in which some lines start with a # character, meaning they are comments, and I want to skip those lines when reading the file in my Python script. How can I do that?
I think this should help you.
I read the whole file and save all the lines into a list.
Then I iterate over this list, looking at the first character of every line.
If the first character is "#", I go on to the next line.
Otherwise, I append the line to a new list called selected_lines.
My code isn't super efficient or a one-liner, but I think it may help you.
lines = []
selected_lines = []
filepath = "/usr//home/Desktop/myfile.txt"
with open(filepath, "r") as f:
    # readlines() already returns a list, so assign it instead of appending it
    lines = f.readlines()
for line in lines:
    if line[0:1] == "#":
        continue  # comment line: skip it
    else:
        selected_lines.append(line)
Something like this would work if it's just the beginning character. If you need it to ignore comments after code, you would need to modify it to if '#' in line: and handle it accordingly.
with open('somefile.txt', 'r') as f:
    for line in f:
        # Use continue so your code doesn't become a nested mess.
        # Lines that get past this check are not comments.
        if line[0] == '#':
            continue
        # Do stuff with line after checking for the comment.
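For the case mentioned above, where a comment follows code on the same line, one possible sketch is to cut each line at the first '#'. The function name strip_comments and the sample text are made up, and io.StringIO stands in for the file. Note the assumption that '#' never appears inside quoted data; if it can, this simple split would cut the line in the wrong place.

```python
import io


def strip_comments(file_obj, marker="#"):
    """Yield each line with everything from the marker onward removed."""
    for line in file_obj:
        # Keep only the part before the first marker, then drop trailing whitespace.
        code = line.split(marker, 1)[0].rstrip()
        if code:  # skip lines that were only a comment or only whitespace
            yield code


# io.StringIO is a stand-in for the real file (assumption for illustration).
sample = io.StringIO("# full-line comment\nx = 1  # trailing comment\n\ny = 2\n")
cleaned = list(strip_comments(sample))
print(cleaned)
```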

How to print lines with a certain length in a file(Python)

I am new to python. I have a document that has one random word per line. There are thousands of words in this file. I am trying to print only the words that are four letters long. I tried this:
f=open("filename.txt")
Words=f.readlines()
for line in f:
    if len(line)==4:
        print(line)
f.close()
But python is blank when I do this. I am assuming I need to strip the blank spaces as well, but when I did
f.strip()
I received an error stating that .strip() doesn't apply to list items. Any help is appreciated. Thanks!
'Python is blank' because you attempt to iterate over the file for a second time.
The first time is with readlines(), so when that iteration is finished you are at the end of the file. Then when you do for line in f you are already at the end of the file so there is nothing left over which to iterate. To fix this, drop the call to readlines().
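A quick way to see this behaviour for yourself, as a sketch with invented sample words and io.StringIO standing in for the file:

```python
import io

# io.StringIO is a stand-in for open("filename.txt"); the words are invented.
f = io.StringIO("tree\nhouse\nbird\n")

words = f.readlines()   # consumes the whole file
leftover = list(f)      # nothing is left to iterate over

f.seek(0)               # rewinding makes the file iterable again
four_letter = [w.strip() for w in f if len(w.strip()) == 4]

print(len(words), leftover, four_letter)
```

Rewinding with seek(0) is shown only to illustrate the point; simply not calling readlines() first is the simpler fix.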
To do what you want, I would just do this:
with open('filename.txt') as f:
    for line in f:  # No need for `readlines()`
        word = line.strip()  # Strip the line, not the file object.
        if len(word) == 4:
            print(word)
Your other error occurs with f.strip() because f is a file object, but you can only strip a string. Therefore, just strip the line on each iteration, as shown in the example above.
You should do:
for line in Words:
instead of
for line in f:
You want line.strip() because f is a file object, not a string.

Remove whitespaces in the beginning of every string in a file in python?

How to remove whitespaces in the beginning of every string in a file with python?
I have a file myfile.txt with the strings as shown below in it:
_ _ Amazon.inc
Arab emirates
_ Zynga
Anglo-Indian
Those underscores are spaces.
The code must go through every line of the file and remove all the whitespace at the beginning of each line.
I've tried using lstrip, but that doesn't work for multiple lines, and neither does readlines().
Would using a for loop make it better?
All you need to do is read the lines of the file one by one and remove the leading whitespace from each line. After that, you can join the lines again and you'll get back the original text without the leading whitespace:
with open('myfile.txt') as f:
    line_lst = [line.lstrip() for line in f.readlines()]
    lines = ''.join(line_lst)
print(lines)
Assuming that your input data is in infile.txt, and you want to write this file to output.txt, it is easiest to use a list comprehension:
inf = open("infile.txt")
stripped_lines = [l.lstrip() for l in inf.readlines()]
inf.close()
# write the new, stripped lines to a file
outf = open("output.txt", "w")
outf.write("".join(stripped_lines))
outf.close()
To read the lines from myfile.txt and write them to output.txt, use
with open("myfile.txt") as input:
    with open("output.txt", "w") as output:
        for line in input:
            output.write(line.lstrip())
That will make sure that you close the files after you're done with them, and it'll make sure that you only keep a single line in memory at a time.
The above code works in Python 2.5 and later because of the with keyword. For Python 2.4 you can use
input = open("myfile.txt")
output = open("output.txt", "w")
for line in input:
    output.write(line.lstrip())
if this is just a small script where the files will be closed automatically at the end. If this is part of a larger program, then you'll want to explicitly close the files like this:
input = open("myfile.txt")
try:
    output = open("output.txt", "w")
    try:
        for line in input:
            output.write(line.lstrip())
    finally:
        output.close()
finally:
    input.close()
You say you already tried lstrip and that it didn't work for multiple lines. The "trick" is to run lstrip on each individual line, like I do above.

How to delete parts of a file in python?

I have a file named a.txt which looks like this:
I'm the first line
I'm the second line.
There may be more lines here.

I'm below an empty line.
I'm a line.
More lines here.
Now, I want to remove the contents above the empty line (including the empty line itself).
How could I do this in a Pythonic way?
Basically you can't delete stuff from the beginning of a file, so you will have to write to a new file.
I think the pythonic way looks like this:
# get an iterator over the lines in the file:
with open("input.txt", 'rt') as lines:
    # while the line is not empty drop it
    for line in lines:
        if not line.strip():
            break
    # now lines is at the point after the first paragraph
    # so write out everything from here
    with open("output.txt", 'wt') as out:
        out.writelines(lines)
Here are some simpler versions of this, without with for older Python versions:
lines = open("input.txt", 'rt')
for line in lines:
    if not line.strip():
        break
open("output.txt", 'wt').writelines(lines)
and a very straightforward version that simply splits the file at the empty line:
# first, read everything from the old file
text = open("input.txt", 'rt').read()
# split it at the first empty line ("\n\n")
first, rest = text.split('\n\n',1)
# make a new file and write the rest
open("output.txt", 'wt').write(rest)
Note that this can be pretty fragile; for example, Windows often uses \r\n as a single line break, so an empty line would be \r\n\r\n instead. But often you know that the file uses only one kind of line break, so this could be fine.
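If you want the split to tolerate both kinds of line break, one option is a regular expression instead of a literal '\n\n'. This is a sketch, not part of the answer above: the function name after_first_blank_line is made up, and returning the text unchanged when there is no blank line is my own assumption about the desired behaviour. In Python 3, opening a file in text mode with the default newline setting already translates \r\n to \n on read, so this mostly matters for text that was read in binary mode or with newline=''.

```python
import re


def after_first_blank_line(text):
    """Return everything after the first blank (or whitespace-only) line."""
    # Matches a line break, optional spaces or tabs, then another line break,
    # with either \n or \r\n as the line break.
    parts = re.split(r"\r?\n[ \t]*\r?\n", text, maxsplit=1)
    if len(parts) == 2:
        return parts[1]
    return text  # assumption: leave the text alone if no blank line exists


print(repr(after_first_blank_line("a\r\nb\r\n\r\nc\r\nd\r\n")))
print(repr(after_first_blank_line("a\n\nb\n")))
```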
Naive approach by iterating over the lines in the file one by one top to bottom:
#!/usr/bin/env python
with open("4692065.txt", 'r') as src, open("4692065.cut.txt", "w") as dest:
    keep = False
    for line in src:
        if keep: dest.write(line)
        if line.strip() == '': keep = True
The fileinput module (from the standard library) is convenient for this kind of thing. It sets things up so you can act as though you are editing the file "in-place":
import fileinput
import sys
fileobj=iter(fileinput.input(['a.txt'], inplace=True))
# iterate through the file until you find an empty line.
for line in fileobj:
    if not line.strip():
        break
# Iterators (like `fileobj`) pick up where they left off.
# Starting a new for-loop saves you one `if` statement and boolean variable.
for line in fileobj:
    sys.stdout.write(line)
Any idea how big the file is going to be?
You could read the file into memory:
f = open('your_file', 'r')
lines = f.readlines()
which will read the file line by line and store those lines in a list (lines).
Then, close the file and reopen with 'w':
f.close()
f = open('your_file', 'w')
for line in lines:
    if your_if_here:
        f.write(line)
This will overwrite the current file. Then you can pick and choose which lines from the list you want to write back in. Probably not a very good idea if the file gets too large though, since the entire file has to reside in memory. But it doesn't require that you create a second file to dump your output.
from itertools import dropwhile, islice
def content_after_emptyline(file_object):
    return islice(dropwhile(lambda line: line.strip(), file_object), 1, None)
with open("filename") as f:
    for line in content_after_emptyline(f):
        print(line, end='')
You could do a little something like this:
with open('a.txt', 'r') as file:
    lines = file.readlines()
blank_line = lines.index('\n')  # index of the first blank line
lines = lines[blank_line+1:]  # keep only what follows the blank line
with open('a.txt', 'w') as file:
    file.writelines(lines)  # each line already ends with '\n'
and that makes the job much simpler.
