So the text file I have is formatted something like this:
a

b

c
I know how to strip() and rstrip() but I want to get rid of the empty lines.
I want to make it shorter like this:
a
b
c
You could remove all blank lines (lines that contain only whitespace) from stdin and/or files given at the command line using the fileinput module:
#!/usr/bin/env python
import sys
import fileinput

for line in fileinput.input(inplace=True):
    if line.strip():  # preserve non-blank lines
        sys.stdout.write(line)
You can use regular expressions:
import re

txt = """a

b

c"""
print(re.sub(r'\n+', '\n', txt))  # replace one or more consecutive \n with a single one
However, lines that contain only spaces or tabs won't be removed. A better solution is:
re.sub(r'(\n[ \t]*)+', '\n', txt)
This way, you will also remove leading spaces.
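The second pattern also strips the indentation of whatever line follows a blank one. A minimal sketch of a variant that drops whitespace-only lines but leaves indentation alone, assuming the whole text fits in memory; the name drop_blank_lines is made up for illustration:

```python
import re

def drop_blank_lines(text):
    """Remove lines that are empty or contain only spaces/tabs."""
    # ^[ \t]*\n matches one whole whitespace-only line, including its newline.
    # re.MULTILINE makes ^ match at the start of every line, not just the text.
    return re.sub(r'^[ \t]*\n', '', text, flags=re.MULTILINE)

sample = "a\n\n  b\n \t\nc\n"
print(repr(drop_blank_lines(sample)))  # 'a\n  b\nc\n'
```

A whitespace-only last line with no newline after it is left in place by this pattern.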
Simply remove any line that only equals "\n":
in_filename = 'in_example.txt'
out_filename = 'out_example.txt'
with open(in_filename) as infile, open(out_filename, "w") as outfile:
    for line in infile.readlines():
        if line != "\n":
            outfile.write(line)
If you want to simply update the same file, close and reopen it to overwrite it with the new data:
filename = 'in_example.txt'
filedata = ""
with open(filename, "r") as infile:
    for line in infile.readlines():
        if line != "\n":
            filedata += line
with open(filename, "w") as outfile:
    outfile.write(filedata)
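The same read-then-rewrite idea can be wrapped in a function. This is only a sketch, assuming the file is small enough to hold in memory; remove_blank_lines is a hypothetical helper name, and unlike the comparison with "\n" it also drops lines that contain only spaces or tabs:

```python
def remove_blank_lines(path):
    """Rewrite the file at `path` without its blank lines."""
    with open(path) as infile:
        # line.strip() is empty for "\n" and for whitespace-only lines
        kept = [line for line in infile if line.strip()]
    with open(path, "w") as outfile:
        outfile.writelines(kept)
    return len(kept)  # number of lines that were kept
```

The file is only reopened for writing after it has been read completely, so the kept lines are safely in memory before the old content is truncated.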
I want to read a text file that contains Python source code and remove comments and extra whitespace from it.
file.txt (source code file)
#Pythonprogramtofindthefactorialofanumberprovidedbytheuser.
num= 7
factorial=1
ifnum<0:
print("Sorry,factorialdoesnotexistfornegativenumbers")
elifn um==0:
print("Thefactorialof0is1")
else:
foriinrange(1,num+1):
factorial=factorial*i
print("Thefactorialof", num," is", factorial)
I have tried reading the file and using a list comprehension to filter the lines, but it is not working to remove comments and some whitespace is being removed that I want to keep.
with open('file.txt', 'r') as file:
    lines = file.readlines()
lines = [line.replace(' ', '') for line in lines]
with open('file.txt', 'w') as file:
    file.writelines(lines)
To remove blank lines and trailing whitespace as well as comments, you could use:
import re

with open("file.txt", "r") as file:
    for line in file:
        line = line.rstrip()
        if line:
            if not re.match(r'\s*#', line):
                print(line)  # the file is open read-only, so print instead of writing
Output
num= 7
factorial=1
ifnum<0:
print("Sorry,factorialdoesnotexistfornegativenumbers")
elifn um==0:
print("Thefactorialof0is1")
else:
foriinrange(1,num+1):
factorial=factorial*i
print("Thefactorialof", num," is", factorial)
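If the cleaned lines are needed for more than printing, the same filter can return a list. A rough sketch, where strip_comment_lines is a hypothetical name; it only recognises full-line comments, so a trailing `# ...` after code, or a `#` inside a string, is left untouched:

```python
def strip_comment_lines(lines):
    """Drop blank lines and full-line comments; trim trailing whitespace."""
    kept = []
    for line in lines:
        line = line.rstrip()
        # skip empty lines and lines whose first non-space character is '#'
        if line and not line.lstrip().startswith('#'):
            kept.append(line)
    return kept

source = ["# a comment\n", "num = 7   \n", "\n", "    # indented comment\n", "print(num)\n"]
print(strip_comment_lines(source))  # ['num = 7', 'print(num)']
```

Leading indentation is preserved, which matters for Python source.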
I have a text file that looks like this
Big:house
small:door
Big:car
Small:chair
Big:plane
How do I remove the lines that contain the word "big" so that it looks like this? I don't want to create a new file altogether, though.
small:door
small:chair
Here was my attempt:
with open('QWAS.txt','r') as oldfile:
    for line in oldfile:
        if bad_words in line:
            newfile.write(line)
This is what we can do:
1. Read the data into a string (dropping rows that start with 'big')
2. Go to the start of the file (seek)
3. Write the string
4. Truncate (remove the leftover old content)
And now to the code, open it in read and write mode:
with open('QWAS.txt','r+') as f:
    data = ''.join([i for i in f if not i.lower().startswith('big')])  #1
    f.seek(0)  #2
    f.write(data)  #3
    f.truncate()  #4
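The read, seek, write, truncate sequence generalises to any prefix. A minimal sketch under the same assumption that the file fits in memory; drop_lines_starting_with is a hypothetical helper, not part of any library:

```python
def drop_lines_starting_with(path, prefix):
    """Remove, in place, every line that starts with `prefix` (case-insensitive)."""
    prefix = prefix.lower()
    with open(path, 'r+') as f:
        kept = [line for line in f if not line.lower().startswith(prefix)]
        f.seek(0)            # back to the start of the file
        f.writelines(kept)   # overwrite with the surviving lines
        f.truncate()         # cut off what is left of the old content
```

Without the final truncate(), the tail of the old, longer content would remain after the newly written lines.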
Try this:
newfile = r'output.txt'
oldfile = r'input.txt'
with open(newfile, 'w') as outfile, open(oldfile, 'r') as infile:
    for line in infile:
        if line[:5].lower() == 'small':
            outfile.write(line)
#output
small:door
Small:chair
Of course, this assumes you want to keep only the rows where small or Small is to the left of the colon. Additionally, you will have a new output file, as I don't think you really want to update your input file.
You can try using regular expressions:
import re

oldfile = open('QWAS.txt','r')
newfile = open('newfile.txt','w')
for line in oldfile:
    if re.search('[Ss]mall',line):
        newfile.write(line)
oldfile.close()
newfile.close()
Which gives the output file "newfile.txt"
small:door
Small:chair
If you also want to take every line that doesn't contain small and write it to a second new file, "newfile2.txt":
import re

oldfile = open('QWAS.txt','r')
newfile = open('newfile.txt','w')
newfile2 = open('newfile2.txt','w')
for line in oldfile:
    if re.search('[Ss]mall',line):
        newfile.write(line)
    else:
        newfile2.write(line)
oldfile.close()
newfile.close()
newfile2.close()
I would like to make a newline after a dot in a file.
For example:
Hello. I am damn cool. Lol
Output:
Hello.
I am damn cool.
Lol
I tried it like that, but somehow it's not working:
f2 = open(path, "w+")
for line in f2.readlines():
    f2.write("\n".join(line))
f2.close()
Could you help me there?
I want not just a newline, I want a newline after every dot in a single file. It should iterate through the whole file and make newlines after every single dot.
Thank you in advance!
This should be enough to do the trick:
with open('file.txt', 'r') as f:
    contents = f.read()
with open('file.txt', 'w') as f:
    f.write(contents.replace('. ', '.\n'))
You could split your string based on . and store in a list, then just print out the list.
s = 'Hello. I am damn cool. Lol'
lines = s.split('.')
for line in lines:
    print(line)
If you do this, the output will be:
Hello
 I am damn cool
 Lol
To remove leading spaces, you could split based on . (with a space), or else use lstrip() when printing.
So, to do this for a file:
# open file for reading
with open('file.txt') as fr:
    # get the text in the file
    text = fr.read()
# split up the file into lines based on '.'
lines = text.split('.')
# open the file for writing
with open('file.txt', 'w') as fw:
    # loop over each line
    for line in lines:
        # remove leading whitespace, and write to the file with a newline
        fw.write(line.lstrip() + '\n')
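Splitting on '.' discards the dots themselves. If they should stay, as in the output the question asks for, a regular expression substitution is one option. A small sketch; newline_after_dots is a made-up name, and it treats every dot as a sentence end, so decimals such as 3.14 and abbreviations would be split too:

```python
import re

def newline_after_dots(text):
    """Put a line break after every '.', dropping spaces/tabs that followed it."""
    # '\.[ \t]*' matches a dot plus any spaces or tabs right after it
    return re.sub(r'\.[ \t]*', '.\n', text)

print(newline_after_dots('Hello. I am damn cool. Lol'))
# Hello.
# I am damn cool.
# Lol
```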
In Python, calling e.g. temp = open(filename,'r').readlines() results in a list in which each element is a line from the file. However, these strings have a newline character at the end, which I don't want.
How can I get the data without the newlines?
You can read the whole file and split lines using str.splitlines:
temp = file.read().splitlines()
Or you can strip the newline by hand:
temp = [line[:-1] for line in file]
Note: this last solution only works if the file ends with a newline, otherwise the last line will lose a character.
This assumption is true in most cases (especially for files created by text editors, which often do add an ending newline anyway).
If you want to avoid this, you can add a newline at the end of the file:
with open(the_file, 'rb+') as f:  # binary mode: Python 3 text files reject f.seek(-1, 2)
    f.seek(-1, 2)  # go to the last byte of the file
    if f.read(1) != b'\n':
        # add the missing newline if not already present
        f.write(b'\n')
        f.flush()
    f.seek(0)
    lines = [line[:-1].decode() for line in f]
Or a simpler alternative is to strip the newline instead:
[line.rstrip('\n') for line in file]
Or even, although pretty unreadable:
[line[:-(line[-1] == '\n') or len(line)+1] for line in file]
Which exploits the fact that the return value of or isn't a boolean, but the object that was evaluated true or false.
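To see why that expression works, it may help to spell out the two cases. In this illustrative sketch, chop_newline is a hypothetical wrapper around the same slice:

```python
def chop_newline(line):
    # -(True) is -1, so a trailing newline gives the slice line[:-1].
    # -(False) is 0, which is falsy, so `or` falls through to len(line) + 1
    # and the slice keeps the whole string.
    return line[:-(line[-1] == '\n') or len(line) + 1]

print(repr(chop_newline("abc\n")))  # 'abc'
print(repr(chop_newline("abc")))    # 'abc'
```

The expression indexes line[-1], so it raises IndexError on an empty string; lines read from a file are never empty, which is why the list comprehension gets away with it.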
The readlines method is actually equivalent to:
def readlines(self):
    lines = []
    for line in iter(self.readline, ''):
        lines.append(line)
    return lines

# or equivalently
def readlines(self):
    lines = []
    while True:
        line = self.readline()
        if not line:
            break
        lines.append(line)
    return lines
Since readline() keeps the newline, readlines() keeps it too.
Note: for symmetry to readlines() the writelines() method does not add ending newlines, so f2.writelines(f.readlines()) produces an exact copy of f in f2.
temp = open(filename,'r').read().split('\n')
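One difference between split('\n') and splitlines() is worth knowing: when the text ends with a newline, split('\n') leaves an empty string at the end of the list. A quick illustration on an in-memory string (the sample text is made up):

```python
text = "first\nsecond\n"   # typical file content: ends with a newline

print(text.split('\n'))     # ['first', 'second', '']
print(text.splitlines())    # ['first', 'second']
```

splitlines() also breaks on other line boundaries such as '\r' and '\r\n', whereas split('\n') only breaks on '\n'.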
Read the file one row at a time, removing unwanted characters from the end of each string with str.rstrip(chars):
with open(filename, 'r') as fileobj:
    for row in fileobj:
        print(row.rstrip('\n'))
See also str.strip([chars]) and str.lstrip([chars]).
I think this is the best option.
temp = [line.strip() for line in file.readlines()]
temp = open(filename,'r').read().splitlines()
My preferred one-liner -- if you don't count from pathlib import Path :)
lines = Path(filename).read_text().splitlines()
This auto-closes the file, so there is no need for with open()...
Added in Python 3.5.
https://docs.python.org/3/library/pathlib.html#pathlib.Path.read_text
Try this:
u=open("url.txt","r")
url=u.read().replace('\n','')
print(url)
To get rid of trailing end-of-line (\n) characters and of empty list values (''), try:
f = open(path_sample, "r")
lines = [line.rstrip('\n') for line in f.readlines() if line.strip() != '']
You can read the file as a list easily using a list comprehension:
with open("foo.txt", 'r') as f:
    lst = [row.rstrip('\n') for row in f]
my_file = open("first_file.txt", "r")
for line in my_file.readlines():
    if line[-1:] == "\n":
        print(line[:-1])
    else:
        print(line)
my_file.close()
This script takes each line from temp.txt and writes it to res.txt without its newline and with ,0 appended:
file = open("temp.txt", "r")
file2 = open("res.txt", "w")
for line in file:
    file2.write(f"{line.splitlines()[0]},0\n")
file.close()
file2.close()
Each line has the form data\n, so splitlines() turns it into a one-element list and [0] picks out the text without the newline.
import csv

with open(filename) as f:
    csvreader = csv.reader(f)
    for line in csvreader:
        print(line[0])
The global variable originalInfo contains
Joe;Bloggs;j.bloggs#anemail.com;0715491874;1
I have written a function to delete that line in a text file containing more information of this type. It works, but it is really clunky and inelegant.
import os

f = open("input.txt",'r')  # Input file
t = open("output.txt", 'w')  # Temp output file
for line in f:
    if line != originalInfo:
        t.write(line)
f.close()
t.close()
os.remove("input.txt")
os.rename('output.txt', 'input.txt')
Is there a more efficient way of doing this? Thanks
Your solution nearly works, but you need to take care of the trailing newline. This is a slightly shorter version that does what you intend:
import shutil

with open("input.txt", 'r') as fin, open("output.txt", 'w') as fout:
    for line in fin:
        if line.strip() != originalInfo:
            fout.write(line)
shutil.move('output.txt', 'input.txt')
Calling strip() costs a little extra work, but it also removes any other surrounding whitespace before the comparison.
Alternatively, you could do:
originalInfo += '\n'
and later in the loop:
if line != originalInfo:
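Either variant can be packaged as a reusable function. The following is only a sketch: delete_line is a hypothetical helper, and it swaps the shutil.move step for a temporary file plus os.replace, so the original is only replaced once the new content has been written completely:

```python
import os
import tempfile

def delete_line(path, target):
    """Remove every line equal to `target` (ignoring its line ending) from `path`."""
    directory = os.path.dirname(os.path.abspath(path))
    # write to a temp file in the same directory so the final rename
    # stays on one filesystem
    with open(path) as fin, tempfile.NamedTemporaryFile(
            'w', dir=directory, delete=False) as fout:
        for line in fin:
            if line.rstrip('\n') != target:
                fout.write(line)
    os.replace(fout.name, path)  # swap the new file in place of the old one
```

It would be called as delete_line("input.txt", originalInfo). Note that the replacement file gets the temporary file's permissions, which may differ from those of the original.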
You can open the file, read it with readlines(), close it, and open it again for writing. This way you don't have to create an output file:
with open('input.txt') as file:
    lines = file.readlines()
with open('input.txt', 'w') as file:
    for line in lines:
        # compare without the trailing newline that readlines() keeps
        if line.rstrip('\n') != originalInfo:
            file.write(line)
But if you want a separate output file:
with open('input.txt') as infile:
    with open('output.txt', 'w') as outfile:
        for line in infile:
            # compare without the trailing newline kept by iteration
            if line.rstrip('\n') != originalInfo:
                outfile.write(line)