Python I/O not creating new line - python

I have the following code:
lines[usernum] = str(user) + "\n"
f = open('users.txt','w')
userstr = str(lines)
f.write(userstr)
Effectively I am modifying the lines list at usernum, then writing it back to the original file 'users.txt'. However, even after adding the "\n", everything gets written to one line rather than each element of lines getting its own line. Why is this?

That is because str(lines) will not do what you expect it to do:
>>> lst = ['This', 'is a\n', 'fancy\n', 'list']
>>> print(str(lst))
['This', 'is a\n', 'fancy\n', 'list']
As you can see, it will write a representation of the list, and not print out the individual lines. You can use str.join to combine the list elements:
lines[usernum] = str(user)
with open('users.txt', 'w') as f:
    f.write('\n'.join(lines))
Using '\n' as the join character will make sure that a newline is inserted between each list element, so you do not need to take care of adding it to every item manually.
Furthermore, when working with files, it is recommended to use the with statement when opening them, to make sure that they are closed correctly when you are done working with them.
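To see the difference concretely, here is a minimal sketch. The sample names and the temporary file path are made up for illustration and are not part of the original question.

```python
import os
import tempfile

# Hypothetical sample data standing in for the contents of users.txt.
lines = ["alice", "bob", "carol"]

# str() on a list returns its representation: brackets, quotes and commas included.
as_repr = str(lines)

# '\n'.join() puts a newline between the elements instead.
as_text = "\n".join(lines)

# Write the joined text to a throwaway file and read it back.
path = os.path.join(tempfile.mkdtemp(), "users.txt")
with open(path, "w") as f:
    f.write(as_text)
with open(path) as f:
    read_back = f.read().split("\n")

print(as_repr)    # ['alice', 'bob', 'carol']
print(read_back)  # ['alice', 'bob', 'carol']
```

Both prints look the same, but the first is a single string containing brackets and quotes, while the second is a real list recovered from three separate lines in the file.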

You are writing the string representation of the list to the file.
The string representation of a Python container is built from the repr() of each element, so the strings themselves are not written out directly.
Use the writelines() method to write a sequence of strings:
f.writelines(lines)
or write each line separately:
for line in lines:
    f.write(line)
or join the lines into one long string:
f.write(''.join(lines))
The latter also allows you to add the newlines at the time of writing, by using \n as the joining string:
f.write('\n'.join(lines))
where lines contains strings without newlines.
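The options above differ only in where the newlines come from. A small sketch, using io.StringIO as a stand-in for a real file (an assumption made so the example runs without touching disk):

```python
import io

lines = ["one", "two", "three"]  # sample strings without trailing newlines

buf_a = io.StringIO()
buf_a.writelines(lines)                           # no separators are added

buf_b = io.StringIO()
buf_b.write("\n".join(lines))                     # newline between items, none at the end

buf_c = io.StringIO()
buf_c.writelines(line + "\n" for line in lines)   # newline after every item

print(repr(buf_a.getvalue()))  # 'onetwothree'
print(repr(buf_b.getvalue()))  # 'one\ntwo\nthree'
print(repr(buf_c.getvalue()))  # 'one\ntwo\nthree\n'
```

Note that writelines() never adds line endings by itself, despite its name.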

Use a for-loop and '\n' to write each item on a new line:
with open('users.txt','w') as f:
    for line in lines:
        f.write('{}\n'.format(line))
Calling str() on a Python object such as a list or dict simply returns the string version of that object; to write its contents to a file you should loop over it.
Demo:
>>> lines = [1, 'foo', 3, 'bar']
>>> with open('users.txt','w') as f:
...     for line in lines:
...         f.write('{}\n'.format(line))
...
>>> !cat users.txt
1
foo
3
bar

Related

Why does the function "len()" return an answer that is 1 character longer than the actual string?

I have created a Python program that removes words from a list if they are not a certain length. I have set up a for loop that cycles through my list and checks if each word is a length of 3 or greater. My code is as follows:
import string
text_file = open("ten-thousand-english-words.txt", "r")
lines = text_file.readlines()
text_file.close()
open('SortedWords.txt', 'w').close()
for i in lines:
    print(len(i))
    if len(i) >= 4:
        sortedFile = open("SortedWords.txt", "a") # append mode
        sortedFile.write(i)
        sortedFile.close()
I wanted to create a new file that only copies the word over if it is 3 characters or longer.
For some reason it reads all the words in the list as 1 character longer than they actually are (e.g. the word “Hello” would return a length of 6 even though the number of letters is 5).
I fixed this by making it so that the length it looks for is 4 instead of 3, and it worked properly. I couldn't find any information about this issue online, so I decided to post this in case anyone knows why this happens.
Each line in a file has a "\n" at the end of it which indicates a newline. We can't see this character with a text editor, since the text editor automatically converts it to a new line, but rest assured it's there. When you read a file in python using readlines(), this "\n" character is preserved. This is why you are getting a length of 1 more than expected.
Here's some code to understand what's going on:
somefile.txt
apple
banana
cow
script.py
with open("somefile.txt") as fi:
    for line in fi.readlines():
        print(repr(line))
>>> 'apple\n'
>>> 'banana\n'
>>> 'cow\n'
The repr function in python will print the literal representation of the string (ie it won't write a newline when it sees "\n", it will just print it as is). If we didn't use repr before printing, our output would be:
apple

banana

cow
Notice there are extra lines in between each string since python is printing the 2 newline characters: 1 from the string itself, and 1 which is added to the end by default from the print function.
To get rid of the newline character, we can use my_string.strip(), which removes any leading or trailing whitespace:
with open("somefile.txt") as fi:
    for line in fi.readlines():
        print(repr(line.strip()))
>>> 'apple'
>>> 'banana'
>>> 'cow'
Every line ends with '\n', and that's the '+1' you see
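A quick way to check this is to compare lengths before and after removing the newline. This sketch uses io.StringIO with made-up words in place of the real word list:

```python
import io

# io.StringIO stands in for a real file with one word per line.
fake_file = io.StringIO("apple\nbanana\ncow\n")

raw_lengths = []
clean_lengths = []
for line in fake_file:
    raw_lengths.append(len(line))                  # counts the trailing '\n'
    clean_lengths.append(len(line.rstrip("\n")))   # newline removed first

print(raw_lengths)    # [6, 7, 4]
print(clean_lengths)  # [5, 6, 3]
```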
Your program may be as simple as
with open("ten-thousand-english-words.txt", "r") as lines:
    with open("SortedWords.txt", "w") as sortedFile:
        for line in lines:
            if len(line) >= 4:
                sortedFile.write(line)
An analysis of your program and an explanation of mine:
Other people have explained why the lengths are longer, so you were right to use >= 4 instead of >= 3.
Your import string is useless.
Your command
open('SortedWords.txt', 'w').close()
is useless (simply remove it and make the changes which I describe below), because it opens the file and immediately closes it, effectively
creating an empty file, if it doesn't already exist,
emptying its content, if it did exist.
Once more, it is useless.
It seems that the only reason for doing it is your later command for repeatedly opening an empty file in the append mode:
if len(i) >= 4:
    sortedFile = open("SortedWords.txt", "a") # append mode
    sortedFile.write(i)
But:
Reopening the file for every word gains nothing.
Why open / close a file repeatedly? You may simply open it once in write mode before the loop and then write to it inside the loop:
sortedFile = open("SortedWords.txt", "w") # NO append mode - WRITE mode
for i in lines:
    if len(i) >= 4:
        sortedFile.write(i)
Instead of manually closing an opened file, use the so-called context manager (i.e. the with statement) for automatically closing it:
with open(...) as f:
    ...
    ...
# after dedent the file is automatically closed
To make the program more robust, remove any whitespace before / after each word (including the \n) using the .strip() method.
In this case
use the >= 3 comparison,
add the \n symbol when writing a word (i.e. the line) to the file:
with open("ten-thousand-english-words.txt", "r") as lines:
    with open("SortedWords.txt", "w") as sortedFile:
        for line in lines:
            line = line.strip()
            if len(line) >= 3:
                sortedFile.write(line + "\n")
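One case where stripping first makes a visible difference is a file whose last line has no trailing newline. The following sketch assumes a tiny made-up word list held in io.StringIO rather than the real file:

```python
import io

# Hypothetical word list whose last line has no trailing newline.
source = io.StringIO("at\ndog\nhouse\ncat")

# Comparing raw line lengths against 4 silently drops the final word.
kept_raw = [line for line in source if len(line) >= 4]

# Stripping first and comparing against 3 treats every line the same way.
source.seek(0)
kept_stripped = [w for w in (line.strip() for line in source) if len(w) >= 3]

print(kept_raw)       # ['dog\n', 'house\n']
print(kept_stripped)  # ['dog', 'house', 'cat']
```

The word "cat" is lost in the first version because it is the only line that is exactly three characters long.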

How to define/ingest a hard-coded multi-line list without using quotes

I have a script that operates on elements of a list
The list is hard-coded at the top of the script and is edited periodically
When adding/removing items it would be ideal not to have to "quote" each item (especially since the users that may edit it may insert entries that have quotes and need to be escaped)
i.e. right now the list is defined as:
blah = [
'banana1',
'banana2',
'banana3'
]
If I wanted to add ban'ana4 then it would look like:
blah = [
'banana1',
'banana2',
'banana3',
'ban\'ana4'
]
Is there a more elegant way to do this other than making it a multi-line text string and then splitting on linebreaks?
I agree with @snakecharmerb's suggestion. It's less error-prone to store string values in a text file and load them whenever you run your Python program. For example, if you store the list items in the text file "test.txt"
test.txt
banana1
banana2
banana3
ban'ana4
Then you can load the list of strings into your program by reading the content in the "test.txt" file:
FILENAME = 'test.txt'
blah = []
with open(FILENAME) as f:
    for line in f:
        # cut off newline characters \n and \r
        l = line.split('\n')[0].split('\r')[0]
        blah.append(l)
Unless it is absolutely necessary to keep the list in the script file, I would read the data from a text file instead. This way any quoting is handled automatically by Python, and there is no risk of typos corrupting the script.
This wouldn't work if some elements of the list are not strings, but that doesn't seem to be likely in your case.
with open('somefile.txt') as f:
    mylist = [line.strip() for line in f]
# do stuff with list
You can use split to avoid quoting; this is a very common idiom in Python. Here is an example from the REPL:
>>> '''foo
... bar'tar
... zar'''.split()
['foo', "bar'tar", 'zar']
Just note that line breaks count as whitespace here, so split() just works (as long as no item contains a space). Inside an indented block the lines of the string are indented too, which leads to another problem when items may contain spaces; this can be fixed by removing the leading spaces while splitting, with something like the below:
import re
from operator import truth
class R:
    def __rsub__(self, string):
        return list(filter(truth, re.split(r'\n\s*', string)))
R = R()
def foo():
    s = '''
    foo
    bar
    tar'zar''' - R
    print(s)
foo()
Just give R a better name :)
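A more conventional way to get the same effect, offered here only as a sketch, is the standard library's textwrap.dedent combined with splitlines(). The function name load_items and the sample items are made up for illustration:

```python
import textwrap

def load_items():
    # The block can be indented to match the surrounding code.
    raw = textwrap.dedent('''
        banana1
        banana2
        ban'ana4
    ''')
    # splitlines() breaks on line boundaries; blank lines are skipped.
    return [line.strip() for line in raw.splitlines() if line.strip()]

blah = load_items()
print(blah)  # ['banana1', 'banana2', "ban'ana4"]
```

Unlike a bare split(), this keeps items that contain spaces intact, since it only splits on line breaks.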
You can use double quotes to include a single quote in the string or triple quotes to contain a mix of the other two:
blah = [
'banana1',
'banana2',
'banana3',
"ban'ana4",
"""ban'an"a5"""
]

python split string but keep delimiter

In Python I can easily read a file line by line into a set, just by using:
file = open("filename.txt", 'r')
content = set(file)
Each of the elements in the set consists of the actual line and also the trailing line break.
Now I have a string with multiple lines, which I want to compare to the content by using the normal set operations.
Is there any way of transforming a string into a set in the same way, so that it also contains the line breaks?
Edit:
The question "In Python, how do I split a string and keep the separators?" deals with a similar problem, but the answer doesn't make it easy to adapt to other use cases.
import re
content = re.split("(\n)", string)
doesn't have the expected effect.
The str.splitlines() method does exactly what you want if you pass True as the optional keepends parameter. It keeps the newlines on the end of each line, and doesn't add one to the last line if there was no newline at the end of the string.
text = "foo\nbar\nbaz"
lines = text.splitlines(True)
print(lines) # prints ['foo\n', 'bar\n', 'baz']
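Applied to the set comparison from the question, this might look as follows. It is a sketch with made-up data, and io.StringIO stands in for the opened file:

```python
import io

# Stand-in for open("filename.txt"); every line ends with a newline.
file_like = io.StringIO("foo\nbar\nbaz\n")
content = set(file_like)

# A multi-line string whose last line has no trailing newline.
text = "bar\nqux\nfoo"
candidates = set(text.splitlines(keepends=True))

common = content & candidates
print(sorted(common))  # ['bar\n']
```

Note that 'foo' does not match 'foo\n': because splitlines() does not invent a newline for the last line, a string without a final newline will not compare equal to the corresponding line read from the file.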
Here's a simple generator that does the job:
content = set(e + "\n" for e in s.split("\n"))
This solution adds an additional newline at the end though.
You can also do it the other way round and remove the line endings when reading the file. Text mode already translates all line endings to '\n' (the old 'U' flag for universal newlines was removed in Python 3.11):
file = open("filename.txt", 'r')
content = set(line.rstrip('\n') for line in file)
Could this be what you mean?
>>> from io import StringIO
>>> someLines=StringIO('''\
... line1
... line2
... line3
... ''')
>>> content=set(someLines)
>>> content
{'line1\n', 'line2\n', 'line3\n'}

Appending lines to a file, then reading them

I want to append or write multiple lines to a file. I believe the following code appends one line:
with open(file_path,'a') as file:
    file.write('1')
My first question is that if I do this:
with open(file_path,'a') as file:
    file.write('1')
    file.write('2')
    file.write('3')
Will it create a file with the following content?
1
2
3
Second question—if I later do:
with open(file_path,'r') as file:
    first = file.read()
    second = file.read()
    third = file.read()
Will that read the content to the variables so that first will be 1, second will be 2 etc? If not, how do I do it?
Question 1: No.
file.write simply writes whatever you pass to it at the position of the pointer in the file. file.write("Hello "); file.write("World!") will produce a file with contents "Hello World!"
You can write a whole line either by appending a newline character ("\n") to each string to be written, or by using the print function's file keyword argument (which I find to be a bit cleaner):
with open(file_path, 'a') as f:
    print('1', file=f)
    print('2', file=f)
    print('3', file=f)
N.B. It is print itself, not the file, that adds the newline, and it only does so by default. print('1', file=f, end='') is equivalent to f.write('1')
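The effect of the end argument can be checked with an in-memory buffer. This sketch uses io.StringIO in place of a real file, which is an assumption made only to keep it self-contained:

```python
import io

buf = io.StringIO()
print('1', file=buf)            # writes '1\n'
print('2', file=buf, end='')    # writes '2' only
buf.write('3')                  # same effect as the line above, no newline
result = buf.getvalue()

print(repr(result))  # '1\n23'
```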
Question 2: No.
file.read() reads the whole file, not one line at a time. In this case you'll get
first == "1\n2\n3\n"
second == ""
third == ""
This is because after the first call to file.read(), the pointer is set to the end of the file. Subsequent calls try to read from the pointer to the end of the file. Since they're in the same spot, you get an empty string. A better way to do this would be:
with open(file_path, 'r') as f: # `file` is a bad variable name since it shadows the class
    lines = f.readlines()
first = lines[0]
second = lines[1]
third = lines[2]
Or:
with open(file_path, 'r') as f:
    first, second, third = f.readlines() # fails if there aren't exactly 3 lines
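The pointer behaviour described above can be observed directly. In this sketch io.StringIO stands in for the file written with the three print calls; seek(0) is used to rewind:

```python
import io

f = io.StringIO("1\n2\n3\n")   # stand-in for the file written earlier

first = f.read()     # consumes everything up to the end
second = f.read()    # pointer is already at the end, so this is ''

f.seek(0)            # move the pointer back to the start
one_line = f.readline()   # reads just the first line
rest = f.readlines()      # reads the remaining lines into a list

print(repr(first), repr(second))  # '1\n2\n3\n' ''
print(repr(one_line), rest)       # '1\n' ['2\n', '3\n']
```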
The answer to the first question is no. You're writing individual characters. You would have to read them out individually.
Also, note that file.read() returns the full contents of the file.
If you wrote individual characters and you want to read individual characters, process the result of file.read() as a string.
text = open(file_path).read()
first = text[0]
second = text[1]
third = text[2]
As for the second question, you should write newline characters, '\n', to terminate each line that you write to the file.
with open(file_path, 'w') as out_file:
    out_file.write('1\n')
    out_file.write('2\n')
    out_file.write('3\n')
To read the lines, you can use file.readlines().
lines = open(file_path).readlines()
first = lines[0] # -> '1\n'
second = lines[1] # -> '2\n'
third = lines[2] # -> '3\n'
If you want to get rid of the newline character at the end of each line, use strip(), which discards all whitespace before and after a string. For example:
first = lines[0].strip() # -> '1'
Better yet, you can use map to apply strip() to every line.
lines = list(map(str.strip, open(file_path).readlines()))
first = lines[0] # -> '1'
second = lines[1] # -> '2'
third = lines[2] # -> '3'
Writing multiple lines to a file
This will depend on how the data is stored. For writing individual values, your current example is:
with open(file_path,'a') as file:
    file.write('1')
    file.write('2')
    file.write('3')
The file will contain the following:
123
It will also contain whatever contents it had previously since it was opened to append. To write newlines, you must add them explicitly; writelines(), which expects an iterable of strings, does not add them for you.
Also, I don't recommend using file as an object name since it shadows the built-in file type in Python 2, so I will use f from here on out.
For instance, here is an example where you have a list of values that you write using write() and explicit newline characters:
my_values = ['1', '2', '3']
with open(file_path,'a') as f:
    for value in my_values:
        f.write(value + '\n')
But a better way would be to use writelines(). To add newlines, you could append them with a list comprehension:
my_values = ['1', '2', '3']
with open(file_path,'a') as f:
    f.writelines([value + '\n' for value in my_values])
If you are looking for printing a range of numbers, you could use a for loop with range (or xrange if using Python 2.x and printing a lot of numbers).
Reading individual lines from a file
To read individual lines from a file, you can also use a for loop:
my_list = []
with open(file_path,'r') as f:
    for line in f:
        my_list.append(line.strip()) # strip out newline characters
This way you can iterate through the lines of the file returned with a for loop (or just process them as you read them, particularly if it's a large file).
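Putting the writing and reading halves together, a round trip might look like the sketch below. The temporary directory and the file name values.txt are assumptions made so the example can run anywhere:

```python
import os
import tempfile

my_values = ['1', '2', '3']
# Hypothetical path inside a fresh temporary directory.
file_path = os.path.join(tempfile.mkdtemp(), "values.txt")

# First append: the file does not exist yet, so 'a' creates it.
with open(file_path, 'a') as f:
    f.writelines(value + '\n' for value in my_values)

# Second append: new lines are added after the existing ones.
with open(file_path, 'a') as f:
    f.writelines(value + '\n' for value in ['4', '5'])

# Read everything back, dropping the newline from each line.
with open(file_path) as f:
    my_list = [line.strip() for line in f]

print(my_list)  # ['1', '2', '3', '4', '5']
```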

Stripping line endings before appending to a list?

Ok I am writing a program that reads text files and goes through the different lines, the problem that I have encountered however is line endings (\n). My aim is to read the text file line by line and write it to a list and remove the line endings before it is appended to the list.
I have tried this:
thelist = []
inputfile = open('text.txt','rU')
for line in inputfile:
    line.rstrip()
    thelist.append(line)
Strings are immutable in Python. All string methods return new strings, and don't modify the original one, so the line
line.rstrip()
effectively does nothing. You can use a list comprehension to accomplish this:
# The "rU" mode was removed in Python 3.11; text mode already uses universal newlines.
with open("text.txt") as f:
    lines = [line.rstrip("\n") for line in f]
Also note that it is strongly recommended to use the with statement to open (and implicitly close) files.
with open('text.txt') as f: # Use with block to close file on block exit
    thelist = [line.rstrip() for line in f]
rstrip doesn't change its argument; it returns a modified string, which is why you must write:
thelist.append(line.rstrip())
But you can write your code more simply:
with open('text.txt') as inputfile:
    thelist = [x.rstrip() for x in inputfile]
Use rstrip('\n') on each line before appending to your list.
I think you need something like this.
s = s.strip(' \t\n\r')
This will strip whitespace from both the beginning and the end of your string.
In Python, strings are immutable, which means that operations return a new string and don't modify the existing one. I.e., you've got it right, but need to re-assign (or name a new variable) using line = line.rstrip().
rstrip returns a new string. It should be line = line.rstrip(). However, the whole code could be shorter:
thelist = list(map(str.rstrip, open('text.txt')))
UPD: Note that calling rstrip() with no arguments trims all trailing whitespace, not just the newline. But there is a concise way to remove only the line endings too:
thelist = open('text.txt').read().splitlines()
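The differences between these variants are easy to miss, so here is a short sketch on a single made-up string:

```python
line = "  indented text \t\n"

unchanged = line
line.rstrip()                       # result is discarded; `line` is not modified

only_newline = line.rstrip("\n")    # removes just the trailing newline
all_trailing = line.rstrip()        # removes all trailing whitespace
both_ends = line.strip()            # removes whitespace on both sides

print(repr(only_newline))  # '  indented text \t'
print(repr(all_trailing))  # '  indented text'
print(repr(both_ends))     # 'indented text'
```

Which one is appropriate depends on whether leading or trailing spaces in the lines carry meaning.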
