Python Remove Whitespace - python

This is my first question, so forgive me if it is not well formed. I am trying to read the contents of a file in Python.
So far I can print the contents of the file, but there is whitespace at the beginning and end of each line, and I don't want the whitespace at the beginning. How do I remove it?
with open('dump.txt','r') as f:
    print f.read()
Thanks!

You can do something like this.
with open('dump.txt','r') as f:
    for line in f:
        print line.lstrip()
lstrip specifically removes the whitespace from the beginning of the string.
PS: read returns the whole content of the file as a single string, so it is better to work line by line, either by iterating over the file object or by calling readline.
UPDATE:
As pointed out, there are several ways of doing this. Another is to read the file with readlines and iterate through the resulting list, stripping the whitespace from each item.
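The line-by-line approach can be sketched as follows, in Python 3 syntax (the answers here use the Python 2 print statement). The helper name left_stripped_lines and the sample text are assumptions made for illustration; io.StringIO stands in for the real dump.txt.

```python
import io

def left_stripped_lines(stream):
    """Yield each line of a text stream with leading whitespace removed."""
    for line in stream:
        # lstrip() removes leading spaces and tabs; for a line with other
        # content, the trailing newline is left in place
        yield line.lstrip()

# Hypothetical stand-in for the contents of dump.txt
sample = io.StringIO("   first\n\t second  \nthird\n")
result = list(left_stripped_lines(sample))
for line in result:
    print(line, end="")  # each line still carries its own newline
```

Note that a line consisting only of whitespace becomes an empty string, because its newline counts as leading whitespace too.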

This reads the file line by line and left-strips each line:
with open('dump.txt','r') as file:
    for line in file:
        print line.lstrip()

To cut off leading or trailing whitespace, you can do
>>> ' Test '.lstrip()
'Test '
or
>>> ' Test '.rstrip()
' Test'
or
>>> ' Test '.strip()
'Test'

Related

How to write to a file with newline characters and avoid empty lines

I'm trying to write encoded data to a file and separate each run with a newline character. However, when doing this there is an empty line between each run -- as shown below.
Using .rstrip()/.strip() only really works when reading the file -- and obviously this cannot be used directly when writing to the file as it would write all the data to a single line.
cFile = open('compFile', 'w')
for i in range(num_lines):
    line = validLine()
    cFile.write(line + "\n")
cFile.close()
cFile = open('compFile', 'r')
for line in cFile:
    print(line)
# Empty space output:
023

034

045
# Desired output:
023
034
045
I think you have already achieved what you want; have a look at your text file.
Be aware that Python reads the \n at the end of each line too, and that print() adds a newline after whatever it prints.
In your case that means your file should look like
023\n
034\n
045\n
When printing, you first read 023\n, and then print() appends another \n to it.
That gives the 023\n\n you see in your console. The file itself already contains what you want.
If you just want to print without a line break, you can use
import sys
sys.stdout.write('.')
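To make the doubling visible, here is a small sketch in Python 3 that captures what reaches stdout in each case. The capture helper and the hard-coded lines list are assumptions for illustration; the list mimics what iterating over the file yields.

```python
import io
import sys
from contextlib import redirect_stdout

lines = ["023\n", "034\n", "045\n"]  # what iterating over the file yields

def capture(emit):
    """Run emit(line) for every line and return everything sent to stdout."""
    buffer = io.StringIO()
    with redirect_stdout(buffer):
        for line in lines:
            emit(line)
    return buffer.getvalue()

doubled = capture(print)                                # print appends its own "\n"
single = capture(lambda line: sys.stdout.write(line))   # written verbatim
```

Here doubled holds two newlines after every number, while single matches the file contents exactly.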
You could use
for i in range(num_lines):
    line = validLine()
    cFile.write(line.strip() + "\n")
#                    ^^^^^
cFile.close()
Off-topic, but consider using a with statement as well.
Using .rstrip()/.strip() only really works when reading the file -- and obviously this cannot be used directly when writing to the file as it would write all the data to a single line.
This is a misconception. Using .rstrip() is exactly the correct tool if you need to write a series of strings, some of which may have a newline character attached:
with open('compFile', 'w') as cFile:
    for i in range(num_lines):
        line = validLine().rstrip("\n")  # remove possible newline
        cFile.write(line + "\n")
Note that if all your lines already have a newline attached, you don't have to add more newlines. Just write the string directly to the file, no stripping needed:
with open('compFile', 'w') as cFile:
    for i in range(num_lines):
        line = validLine()  # line with "\n" newline already present
        cFile.write(line)   # no need to add a newline anymore
Next, you are reading lines with newlines from your file and then printing them with print(). By default, print() adds another newline, so you end up with double-spaced lines: your input file contains 023\n034\n045\n, but printing each line ('023\n', then '034\n', then '045\n') adds a newline afterwards, so 023\n\n034\n\n045\n\n is written to stdout.
Either strip that newline when printing, or tell print() to not add a newline of its own by giving it an empty end parameter:
with open('compFile', 'r') as cFile:
    for line in cFile:
        print(line, end='')
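Putting the write and read sides together, a round trip might look like the sketch below. The function names write_runs and read_runs, the sample data, and the temporary directory are assumptions for illustration; validLine() from the question is replaced by a plain list.

```python
import os
import tempfile

def write_runs(path, runs):
    """Write each run on its own line, whether or not it already ends in a newline."""
    with open(path, "w") as out:
        for run in runs:
            out.write(run.rstrip("\n") + "\n")  # exactly one newline per run

def read_runs(path):
    """Read the runs back without their line terminators."""
    with open(path) as src:
        return [line.rstrip("\n") for line in src]

# Hypothetical data: some runs arrive with a newline attached, some without
runs = ["023\n", "034", "045\n"]
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "compFile")
    write_runs(path, runs)
    restored = read_runs(path)
```

Normalizing on write means the reader never has to guess whether a line carries zero, one, or two newlines.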

Python Printing Quotation Marks on Two Lines Instead of One

I am reading from an external text file, named 'greeting.txt' where the contents of the text file are simply:
HELLO
However, when I attempt to print the contents of the text file enclosed in quotes the terminal prints out:
"HELLO
"
I am using the following code:
for line in open('greeting.txt', "r"): print ('"%s"' % line)
I desire the string to be enclosed in quotes printed on the same line.
I have never encountered this problem before despite using Python for similar purposes, any help would be appreciated.
There is an end-of-line character in your text file after HELLO. That end-of-line character is also enclosed in the double quotes, which pushes the second quote onto the next line. Strip it with rstrip():
for line in open('greeting.txt', "r"): print ('"%s"' % line.rstrip())
The problem is that your file probably contains HELLO\n, so when you read the whole line you print "HELLO\n", which puts the newline in front of the second quote. Use the strip() method to remove the surrounding whitespace, including the trailing newline, like so:
for line in open('greeting.txt', "r"): print ('"%s"' % line.strip())
However, I would suggest changing your code to:
with open('greeting.txt', "r") as f:
    for line in f:
        print ('"%s"' % line.strip())
I prefer this because the with statement ensures the file is closed as soon as I am done with it.
You can strip the trailing whitespace using the rstrip() method.
for line in open('greeting.txt', "r"): print ('"%s"' % line.rstrip())
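If the line may contain spaces that should stay inside the quotes, one option is to strip only the line terminator rather than all whitespace. A minimal sketch, in which the helper name quote_lines and the sample lines are assumptions for illustration:

```python
def quote_lines(lines):
    """Wrap each line in double quotes after dropping its line terminator."""
    # rstrip("\r\n") removes trailing "\r" and "\n" characters only,
    # so spaces and tabs that belong to the text are preserved
    return ['"%s"' % line.rstrip("\r\n") for line in lines]

quoted = quote_lines(["HELLO\n", "  padded  \n"])
for item in quoted:
    print(item)
```

With strip() or a bare rstrip(), the second line would lose its padding as well.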

Stripping WhiteSpace & Comments using python file executable from Command Line

Trying to write a program that takes in, from the command line, my executable file + an optional argument called 'no-comments' and the file.
So if someone writes in command line: stripWhiteSpace.py file.rtf
Then it will strip all the whitespace EXCEPT new lines.
If someone writes in command line: stripWhiteSpace.py no-comments file.rtf
Then it will strip all the whitespace except new lines, AND also remove all C/C++/Java style comments starting with "//" and anything that comes after that (that whole comment).
Here is my code (called stripWhiteSpace.py):
import sys
file = sys.argv[-1]
with open(file, 'r+') as f:
    final_file = ""
    if sys.argv[1] == "no-comments":
        for line in f:
            line = line.partition('//')[0]
            line = line.strip(' \t\r')
            final_file += line
    else:
        for line in f:
            line = line.strip(' \t\r')
            final_file += line
    f.write(final_file)
The file is successfully passed through my python file. The problem is, it doesn't change. Any help is appreciated.
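The strip() call itself is not the problem. line.strip(' \t\r') treats its argument as a set of characters, not as a sequence: it removes any mix of spaces, tabs, and carriage returns from both ends of the line. (Passing a list, as in line.strip([' ', '\t', '\r']), raises a TypeError.) A more likely culprit is the 'r+' mode: once the loop has read the file to the end, f.write(final_file) writes at the current position, so the stripped text is typically appended after the original content instead of replacing it. Calling f.seek(0) before the write and f.truncate() after it, or reopening the file with 'w', avoids that.
Having said all that, if you want to remove ALL whitespace, strip() won't do that; it only removes whitespace at the start and end of the line. To remove whitespace inside the line as well, you need to use .replace() for each whitespace character.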
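As a sketch of the intended behavior (remove spaces, tabs, and carriage returns everywhere but keep newlines), the function below works on a string rather than a file. The name strip_whitespace and its no_comments parameter are assumptions for illustration, and the comment handling is deliberately naive.

```python
import re

def strip_whitespace(text, no_comments=False):
    """Remove spaces, tabs, and carriage returns from text but keep newlines.

    With no_comments=True, everything from "//" to the end of each line is
    dropped first. This naive rule also cuts "//" inside string literals.
    """
    cleaned = []
    for line in text.split("\n"):
        if no_comments:
            line = line.partition("//")[0]
        # Delete every space, tab, and carriage return, not just those at the ends
        cleaned.append(re.sub(r"[ \t\r]", "", line))
    return "\n".join(cleaned)

source = "int x = 1; // set x\n\ty = 2;\r\n"
plain = strip_whitespace(source)
uncommented = strip_whitespace(source, no_comments=True)
```

Splitting on "\n" and joining again keeps every newline, including the one on lines whose comment was removed.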

Python is adding extra newline to the output

The input file: a.txt
aaaaaaaaaaaa
bbbbbbbbbbb
cccccccccccc
The python code:
with open("a.txt") as f:
    for line in f:
        print line
The problem:
aaaaaaaaaaaa

bbbbbbbbbbb

cccccccccccc
As you can see, the output has an extra line between each item.
How can I prevent this?
print appends a newline, and the input lines already end with a newline.
A standard solution is to output the input lines verbatim:
import sys
with open("a.txt") as f:
    for line in f:
        sys.stdout.write(line)
PS: For Python 3 (or Python 2 with the print function), abarnert's print(…, end='') solution is the simplest one.
As the other answers explain, each line already ends with a newline, and when you print a bare string, print adds another newline at the end. There are two ways around this; everything else is a variation on the same two ideas.
First, you can strip the newlines as you read them:
with open("a.txt") as f:
    for line in f:
        print line.rstrip()
This will strip any other trailing whitespace, like spaces or tabs, as well as the newline. Usually you don't care about this. If you do, you probably want to use universal newline mode, and strip off the newlines:
with open("a.txt", "rU") as f:
    for line in f:
        print line.rstrip('\n')
However, if you know the text file will be, say, a Windows-newline file, or a native-to-whichever-platform-I'm-running-on-right-now-newline file, you can strip the appropriate endings explicitly:
with open("a.txt") as f:
    for line in f:
        print line.rstrip('\r\n')
or:
import os
with open("a.txt") as f:
    for line in f:
        print line.rstrip(os.linesep)
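The difference between these rstrip variants is easiest to see side by side. In this sketch the sample string raw is an assumption for illustration: a line with trailing blanks and a Windows line ending.

```python
raw = "value \t\r\n"  # hypothetical line with trailing blanks and a Windows ending

no_trailing_ws = raw.rstrip()        # strips all trailing whitespace
no_lf = raw.rstrip("\n")             # strips only trailing "\n" characters
no_crlf = raw.rstrip("\r\n")         # strips any trailing "\r" and "\n" characters

print(repr(no_trailing_ws), repr(no_lf), repr(no_crlf))
```

The argument to rstrip is a set of characters, so rstrip("\r\n") also handles a lone "\n" or a lone "\r".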
The other way to do it is to leave the original newline, and just avoid printing an extra one. While you can do this by writing to sys.stdout with sys.stdout.write(line), you can also do it from print itself.
If you just add a comma to the end of the print statement, instead of printing a newline, it adds a "smart space". Exactly what that means is a bit tricky, but the idea is supposed to be that it adds a space when it should, and nothing when it shouldn't. Like most DWIM algorithms, it doesn't always get things right—but in this case, it does:
with open("a.txt") as f:
    for line in f:
        print line,
Of course we're now assuming that the file's newlines match your terminal's—if you try this with, say, classic Mac files on a Unix terminal, you'll end up with each line printing over the last one. Again, you can get around that by using universal newlines.
Anyway, you can avoid the DWIM magic of smart space by using the print function instead of the print statement. In Python 2.x, you get this by using a __future__ declaration:
from __future__ import print_function
with open("a.txt") as f:
    for line in f:
        print(line, end='')
Or you can use a third-party wrapper library like six, if you prefer.
What happens is that each line has a newline at the end, and the print statement in Python also adds a newline. You can strip the newlines:
with open("a.txt") as f:
    for line in f:
        print line.strip()
You could also try the splitlines() method, which drops the line endings automatically:
with open('a.txt') as f:
    text = f.read()
for l in text.splitlines():
    print l
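For comparison, here is how splitlines() differs from a plain split on "\n". The sample text, with one Windows-style ending mixed in, is an assumption for illustration.

```python
text = "aaaaaaaaaaaa\nbbbbbbbbbbb\r\ncccccccccccc\n"

by_splitlines = text.splitlines()   # handles "\n", "\r\n" and "\r"; no empty last item
by_split = text.split("\n")         # keeps "\r" and yields a trailing empty string

print(by_splitlines)
print(by_split)
```

Because splitlines() reads the whole text into a list first, iterating over the file object remains the better choice for very large files.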
It is not adding a newline, but each scanned line from your file has a trailing one.
Try:
with open("a.txt") as f:
    for line in (x.rstrip('\n') for x in f):
        print line

Remove backslash instances from a file in Python

Okay, this may sound like a stupid question, but I can't solve this problem...
I need to remove all instances of backslash from a downloaded file... But,
output.replace("\","")
doesn't work. Python considers "\"," a string, rather than "\" one string and "" the other one.
How can I remove backslashes?
EDIT:
New problem...
Originally, downloaded file had to be processed, which I did using:
fn = "result_cache.txt"
f = open(fn)
output = []
for line in f:
    if content in line:
        output.append(line)
f.close()
f = open(fn, "w")
f.writelines(output)
f.close()
output=str(output)
#irrelevant stuff
with open("result_cache.txt", "wt") as out:
    out.write(output.replace("\\n","\n"))
Which worked okay, reducing file's content to only one line...
And finally ended with having this contents only:
Line of text\
Another line of text\
There\\\'s more text here\
Last line of text
I can't use the same thing again, because it would transform every line to a value in a list, leaving brackets and commas... So, I need to have:
out.write(output.replace("\\n","\n"))
out.write(output.replace("\\",""))
in the same line... How? Or is there another way?
Just escape the backslash with a backslash:
output.replace("\\","")
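For the follow-up in the EDIT, the two replacements can be chained, since replace() returns a new string. A minimal sketch, assuming output holds text shaped like the sample in the question (the literal below is made up for illustration):

```python
# Hypothetical text containing a literal backslash-n and stray backslashes
output = "Line of text\\nThere\\\\\\'s more text here\\nLast line of text"

# Order matters: turn the two-character sequence \n into a real newline first,
# then drop whatever backslashes remain.
cleaned = output.replace("\\n", "\n").replace("\\", "")
print(cleaned)
```

Reversing the order would delete the backslash of every \n sequence first and leave a stray letter n behind. Note also that a raw string cannot help with a lone backslash: r"\" is a syntax error, so "\\" is the way to spell it.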
