Parsing the text in a file

I am trying to print the contents of a text file using the code below. The text file will only contain one line. My current output, shown below, is a list with "\r\n" at the end; I want the output to be as shown in "EXPECTED OUTPUT".
branch_textfile = branch + '.txt'
with open(branch_textfile, 'r') as f:  # open the file
    contents = f.readlines()  # put the lines into a variable (list)
    #contents = contents.rstrip()
print contents
CURRENT OUTPUT:-
['AU_TEST_PUBLIC_BRANCH.05.01.01.151.005\r\n']
EXPECTED OUTPUT:-
AU_TEST_PUBLIC_BRANCH.05.01.01.151.005

>>> x = ['AU_TEST_PUBLIC_BRANCH.05.01.01.151.005\r\n']
>>> print x[0].rstrip()
AU_TEST_PUBLIC_BRANCH.05.01.01.151.005
>>>

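It does that because f.readlines() returns a list, and printing a list shows the brackets. You can avoid the brackets by printing the first element:
print contents[0]
This works, but it only prints the first line of the file, and the line break at the end of that line is still part of the string.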

Use contents = f.read().rstrip() instead of contents = f.readlines(). This will read the file into a single string and remove the trailing whitespace.
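
A minimal sketch of the difference, assuming a one-line file with a Windows-style line ending like the one in the question. The temporary file and the variable names are made up for the illustration.

```python
import os
import tempfile

# Hypothetical one-line file ending in "\r\n", as in the question.
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "wb") as f:
    f.write(b"AU_TEST_PUBLIC_BRANCH.05.01.01.151.005\r\n")

# newline="" turns off newline translation so the "\r\n" stays visible.
with open(path, "r", newline="") as f:
    as_list = f.readlines()        # a list holding one string

with open(path, "r", newline="") as f:
    as_string = f.read().rstrip()  # one string, trailing whitespace removed

os.remove(path)

print(as_list)    # the list, brackets and line ending included
print(as_string)  # just the text
```

rstrip() with no arguments removes all trailing whitespace, so it handles "\r\n" and "\n" alike.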

Why did you comment out the .rstrip() call with "#"? It is the right method, but it has to be called on a string, not on the list that readlines() returns.
You can also put it at the end of the statement like this:
with open('file', 'r') as f:
    data = f.read().strip()

Related

Deleting line from file by containing a specific string is not working in python

I'm trying to delete each line which contains "annote = {" but my code is not working.
I have a file stored in a variable myFile, and I want to go through every line of this file and delete every line which contains the string annote.
This is basically my code:
print(myFile.read())  # prints myFile
myFile.seek(0)
for lines in myFile:
    if b"annote = {" in lines:
        lines = lines.replace(b'.', b'')
myFile.seek(0)
print(myFile.read())  # this prints exactly the same as the print call above, so the annote lines
                      # haven't been removed from this file
I have no idea why the annote lines don't get removed. There is probably something wrong with the replace method, because execution always gets inside the if statement but nothing happens to the annote lines. I've also tried lines.replace(b'.', b'') instead of lines = lines.replace(b'.', b''), but nothing happened.
I hope someone can help me with this problem.

This will do it for you.
f.readlines() returns a list of text lines.
Then you check for the lines that do not contain the text you want to remove.
Then you write them to a separate new file.
f2 = open('newtextfile.txt', 'w')
with open('text_remove_line.txt', 'r') as f:
    for line in f.readlines():
        if 'annote = {' not in line:
            f2.write(line)
f2.close()

This should work:
with open('input.txt') as fin:
    lines = fin.read().split('\n')  # read the text, split it into lines
with open('output.txt', 'w') as fout:
    # write out only the lines that do not contain the text 'annote = {'
    fout.write('\n'.join([i for i in lines if i.find('annote = {') == -1]))
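
The same idea can be sketched as a small helper that streams the file line by line instead of loading it whole. The function name remove_lines_containing and the sample file contents are hypothetical, chosen only for this illustration.

```python
import os
import tempfile

def remove_lines_containing(src_path, dst_path, needle):
    """Copy src_path to dst_path, skipping every line that contains needle."""
    with open(src_path, "r") as src, open(dst_path, "w") as dst:
        for line in src:
            if needle not in line:
                dst.write(line)

# Hypothetical sample input for the demonstration.
workdir = tempfile.mkdtemp()
src_path = os.path.join(workdir, "input.txt")
dst_path = os.path.join(workdir, "output.txt")
with open(src_path, "w") as f:
    f.write("title = {A}\nannote = {remove me}\nyear = {2020}\n")

remove_lines_containing(src_path, dst_path, "annote = {")

with open(dst_path) as f:
    result = f.read()
print(result)
```

Writing to a second file avoids modifying a file while it is being read. Reassigning the loop variable, as in the question, only changes a local name and never touches the file.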

Send keylogger log files to e-mail [duplicate]

I have a text file that looks like:
ABC
DEF
How can I read the file into a single-line string without newlines, in this case creating a string 'ABCDEF'?
For reading the file into a list of lines, but removing the trailing newline character from each line, see How to read a file without newlines?.

You could use:
with open('data.txt', 'r') as file:
    data = file.read().replace('\n', '')
Or, if the file content is guaranteed to be one line:
with open('data.txt', 'r') as file:
    data = file.read().rstrip()

In Python 3.5 or later, using pathlib you can copy text file contents into a variable and close the file in one line:
from pathlib import Path
txt = Path('data.txt').read_text()
and then you can use str.replace to remove the newlines:
txt = txt.replace('\n', '')

You can read from a file in one line:
text = open('very_Important.txt', 'r').read()
Please note that this does not close the file explicitly.
CPython will close the file when the file object is garbage collected.
But other Python implementations may not do so promptly. To write portable code, it is better to use with or close the file explicitly. Short is not always better. See https://stackoverflow.com/a/7396043/362951
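
A small check of the point about closing, using the closed attribute of file objects. The temporary file and variable names are assumptions made for the demonstration.

```python
import os
import tempfile

# Hypothetical sample file.
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as f:
    f.write("ABC\nDEF\n")

# With a with block, the file is closed as soon as the block ends.
with open(path) as f:
    data = f.read()
closed_after_with = f.closed

# Without it, the file stays open until close() is called
# (or until the object is garbage collected).
handle = open(path)
data_again = handle.read()
closed_before_close = handle.closed
handle.close()
closed_after_close = handle.closed

os.remove(path)
print(closed_after_with, closed_before_close, closed_after_close)
```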

To join all lines into a string and remove newlines, I normally use:
with open('t.txt') as f:
    s = " ".join([l.rstrip("\n") for l in f])

with open("data.txt") as myfile:
    data = "".join(line.rstrip() for line in myfile)
join() will join a list of strings, and rstrip() with no arguments will trim whitespace, including newlines, from the end of strings.
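
To illustrate the difference between rstrip() and rstrip("\n"), here is a short sketch. io.StringIO stands in for an open text file, and the sample text with trailing spaces is made up for the example.

```python
import io

# io.StringIO behaves like an open text file for iteration purposes.
fake_file = io.StringIO("ABC  \nDEF\n")

# rstrip() with no arguments drops trailing spaces as well as the newline.
stripped_all = "".join(line.rstrip() for line in fake_file)

fake_file.seek(0)  # rewind so the lines can be read again

# rstrip("\n") drops only the newline and keeps the trailing spaces.
stripped_newline_only = "".join(line.rstrip("\n") for line in fake_file)

print(repr(stripped_all))
print(repr(stripped_newline_only))
```

Which one fits depends on whether trailing spaces in the file are meaningful.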

This can be done using the read() method:
text_as_string = open('Your_Text_File.txt', 'r').read()
Or, since the default mode is 'r' (read), simply use:
text_as_string = open('Your_Text_File.txt').read()

I'm surprised nobody mentioned splitlines() yet.
with open("data.txt", "r") as myfile:
    data = myfile.read().splitlines()
Variable data is now a list that looks like this when printed:
['LLKKKKKKKKMMMMMMMMNNNNNNNNNNNNN', 'GGGGGGGGGHHHHHHHHHHHHHHHHHHHHEEEEEEEE']
Note there are no newlines (\n).
At that point, it sounds like you want to print the lines back to the console, which you can achieve with a for loop:
for line in data:
    print(line)
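
A quick comparison of splitlines() with split("\n"), sketched on an in-memory string with mixed line endings (the sample text is an assumption for the example):

```python
text = "ABC\r\nDEF\n"

# splitlines() understands "\r\n" and "\n" and leaves no empty tail element.
with_splitlines = text.splitlines()

# split("\n") keeps the "\r" and produces a trailing empty string.
with_split = text.split("\n")

# Joining the splitlines() result gives the single-line string.
joined = "".join(with_splitlines)

print(with_splitlines)
print(with_split)
print(joined)
```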

It's hard to tell exactly what you're after, but something like this should get you started:
with open("data.txt", "r") as myfile:
    data = ' '.join([line.replace('\n', '') for line in myfile.readlines()])

I have fiddled around with this for a while and prefer to use read in combination with rstrip. Without rstrip("\n"), the string keeps the newline that ends the file, which in most cases is not very useful.
with open("myfile.txt") as f:
    file_content = f.read().rstrip("\n")
print(file_content)

Here are four snippets to choose from:
with open("my_text_file.txt", "r") as file:
    data = file.read().replace("\n", "")
or
with open("my_text_file.txt", "r") as file:
    data = "".join(file.read().split("\n"))
or
with open("my_text_file.txt", "r") as file:
    data = "".join(file.read().splitlines())
or
with open("my_text_file.txt", "r") as file:
    data = "".join([line.rstrip("\n") for line in file])

You can compress this into two lines of code:
content = open('filepath', 'r').read().replace('\n', ' ')
print(content)
If your file reads:
hello how are you?
who are you?
blank blank
the Python output is:
hello how are you? who are you? blank blank

You can also strip each line and concatenate the results into a final string.
with open("data.txt", "r") as myfile:
    data = ""
    for line in myfile.readlines():
        data = data + line.strip()
This would also work out just fine.

This is a one line, copy-pasteable solution that also closes the file object:
_ = open('data.txt', 'r'); data = _.read(); _.close()

f = open('data.txt', 'r')
string = ""
while True:
    line = f.readline()
    if not line:
        break
    string += line.rstrip('\n')  # drop the newline so the result is one line
f.close()
print(string)

python3: Google "list comprehension" if the square bracket syntax is new to you.
with open('data.txt') as f:
    lines = [line.strip('\n') for line in f]

One-liner:
List: "".join([line.rstrip('\n') for line in open('file.txt')])
Generator: "".join((line.rstrip('\n') for line in open('file.txt')))
The list version is faster than the generator version but heavier on memory; the generator version is slower but lighter on memory, like iterating over the lines. In the case of "".join(), I think both should work well. Remove the "".join() call to get the list or the generator itself.
Note: the file is not closed explicitly here. CPython closes it when the file object is garbage collected, but a with block is the safer choice.

Have you tried this?
x = "yourfilename.txt"
y = open(x, 'r').read()
print(y)

To remove line breaks in Python you can use the replace method of a string.
This example removes all three types of line break (\r\n, \r, and \n):
my_string = open('lala.json').read()
print(my_string)
my_string = my_string.replace("\r","").replace("\n","")
print(my_string)
Example file is:
{
"lala": "lulu",
"foo": "bar"
}
You can try it in this repl:
https://repl.it/repls/AnnualJointHardware

I don't feel that anyone addressed the [ ] part of your question. When you read the lines into your variable, you ended up creating a list, because there were multiple lines before you replaced the \n with ''. If you have a variable x and display it with
x
or print(x)
or str(x)
you will see the entire list with the brackets. If you access a single element of the list
x[0]
then the brackets are omitted. If you print it (print calls str() on its argument), you will see just the data, without the '' either.
print(str(x[0]))

Maybe you could try this? I use this in my programs.
Data = open('data.txt', 'r')
data = Data.readlines()
for i in range(len(data)):
    data[i] = data[i].strip() + ' '
data = ''.join(data).strip()

A regular expression works too:
import re
with open("depression.txt") as f:
    l = re.split(' ', re.sub('\n', ' ', f.read()))[:-1]
print(l)
['I', 'feel', 'empty', 'and', 'dead', 'inside']

with open('data.txt', 'r') as file:
    data = [line.strip('\n') for line in file.readlines()]
data = ''.join(data)

from pathlib import Path
line_lst = Path("to/the/file.txt").read_text().splitlines()
This is the best way to get all the lines of a file; the '\n' characters are already stripped by splitlines() (which recognizes Windows, Mac, and Unix line endings).
But if you nonetheless want to strip each line:
line_lst = [line.strip() for line in Path("to/the/file.txt").read_text().splitlines()]
strip() was just a useful example; you can process each line as you please.
In the end, do you just want the concatenated text?
txt = ''.join(Path("to/the/file.txt").read_text().splitlines())
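
A self-contained sketch of the pathlib approach, run against a temporary file so it can be tried as is. The file name and its contents are placeholders invented for the example.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "file.txt"        # hypothetical file
    path.write_text("  ABC \nDEF\n")     # sample content with stray spaces

    lines = path.read_text().splitlines()       # newlines removed
    stripped = [line.strip() for line in lines] # surrounding spaces removed too
    joined = "".join(stripped)                  # one string, no newlines

print(lines)
print(stripped)
print(joined)
```

read_text() opens, reads, and closes the file in one call, so no with block is needed for the read itself.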

This works:
Change your file to:
LLKKKKKKKKMMMMMMMMNNNNNNNNNNNNN GGGGGGGGGHHHHHHHHHHHHHHHHHHHHEEEEEEEE
Then:
file = open("file.txt")
line = file.read()
words = line.split()
This creates a list named words that equals:
['LLKKKKKKKKMMMMMMMMNNNNNNNNNNNNN', 'GGGGGGGGGHHHHHHHHHHHHHHHHHHHHEEEEEEEE']
That got rid of the "\n". To answer the part about the brackets getting in your way, just do this:
for word in words:  # Assuming words is the list above
    print(word)  # Prints each word in the file on a different line
Or:
print(words[0] + ",", words[1])  # "+" joins the strings without a space;
                                 # the comma between arguments adds a space
This prints:
LLKKKKKKKKMMMMMMMMNNNNNNNNNNNNN, GGGGGGGGGHHHHHHHHHHHHHHHHHHHHEEEEEEEE

with open(player_name, 'r') as myfile:
    data = myfile.readline()
words = data.split(" ")
word = words[0]
This code reads the first line and then uses split to store the space-separated words of that line in a list.
Then you can easily access any word, or even store it in a string.
You can also do the same thing using a for loop.

file = open("myfile.txt", "r")
lines = file.readlines()
file.close()
text = ''  # start with an empty string
for i in range(len(lines)):
    text += lines[i].rstrip('\n') + ' '
print(text)

Try the following:
with open('data.txt', 'r') as myfile:
    data = myfile.read()
sentences = data.split('\n')
for sentence in sentences:
    print(sentence)
Caution: this does not remove the \n from data. It only prints the text line by line, as if there were no \n.

Appending lines to a file, then reading them

I want to append or write multiple lines to a file. I believe the following code appends one line:
with open(file_path,'a') as file:
    file.write('1')
My first question is: if I do this:
with open(file_path,'a') as file:
    file.write('1')
    file.write('2')
    file.write('3')
Will it create a file with the following content?
1
2
3
Second question: if I later do:
with open(file_path,'r') as file:
    first = file.read()
    second = file.read()
    third = file.read()
Will that read the content into the variables so that first will be 1, second will be 2, etc.? If not, how do I do it?

Question 1: No.
file.write simply writes whatever you pass to it at the position of the pointer in the file. file.write("Hello "); file.write("World!") will produce a file with contents "Hello World!"
You can write a whole line either by appending a newline character ("\n") to each string to be written, or by using the print function's file keyword argument (which I find to be a bit cleaner):
with open(file_path, 'a') as f:
    print('1', file=f)
    print('2', file=f)
    print('3', file=f)
N.B. Writing to a file does not add a newline by itself; it is print that appends one by default. print('1', file=f, end='') is identical to f.write('1').
Question 2: No.
file.read() reads the whole file, not one line at a time. In this case you'll get
first == "1\n2\n3\n"
second == ""
third == ""
This is because after the first call to file.read(), the pointer is set to the end of the file. Subsequent calls try to read from the pointer to the end of the file. Since they're in the same spot, you get an empty string. A better way to do this would be:
with open(file_path, 'r') as f:  # `file` is a bad variable name since it shadows the built-in class in Python 2
    lines = f.readlines()
first = lines[0]
second = lines[1]
third = lines[2]
Or:
with open(file_path, 'r') as f:
    first, second, third = f.readlines()  # fails if there aren't exactly 3 lines
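
The behaviour of the file pointer can be checked with a short sketch. The temporary file is an assumption for the demonstration; seek(0) is the standard way to move the pointer back to the start.

```python
import os
import tempfile

# Write three lines with print, as suggested above.
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as f:
    print("1", file=f)
    print("2", file=f)
    print("3", file=f)

with open(path) as f:
    first_read = f.read()      # everything, pointer now at end of file
    second_read = f.read()     # nothing left to read
    f.seek(0)                  # move the pointer back to the start
    after_seek = f.readline()  # reads only the first line

os.remove(path)

print(repr(first_read))
print(repr(second_read))
print(repr(after_seek))
```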

The answer to the first question is no. You're writing individual characters. You would have to read them out individually.
Also, note that file.read() returns the full contents of the file.
If you wrote individual characters and you want to read individual characters, process the result of file.read() as a string.
text = open(file_path).read()
first = text[0]
second = text[1]
third = text[2]
As for the second question, you should write newline characters, '\n', to terminate each line that you write to the file.
with open(file_path, 'w') as out_file:
    out_file.write('1\n')
    out_file.write('2\n')
    out_file.write('3\n')
To read the lines, you can use file.readlines().
lines = open(file_path).readlines()
first = lines[0] # -> '1\n'
second = lines[1] # -> '2\n'
third = lines[2] # -> '3\n'
If you want to get rid of the newline character at the end of each line, use strip(), which discards all whitespace before and after a string. For example:
first = lines[0].strip() # -> '1'
Better yet, you can use map to apply strip() to every line.
lines = list(map(str.strip, open(file_path).readlines()))
first = lines[0] # -> '1'
second = lines[1] # -> '2'
third = lines[2] # -> '3'

Writing multiple lines to a file
This will depend on how the data is stored. For writing individual values, your current example is:
with open(file_path,'a') as file:
    file.write('1')
    file.write('2')
    file.write('3')
The file will contain the following:
123
It will also contain whatever contents it had previously since it was opened to append. To write newlines, you must add them explicitly; writelines(), which expects an iterable of strings, does not add them for you.
Also, I don't recommend using file as an object name since it shadows the built-in file type in Python 2, so I will use f from here on out.
For instance, here is an example where you have a list of values that you write using write() and explicit newline characters:
my_values = ['1', '2', '3']
with open(file_path,'a') as f:
    for value in my_values:
        f.write(value + '\n')
But a better way would be to use writelines(). To add newlines, you could append them in a list comprehension:
my_values = ['1', '2', '3']
with open(file_path,'a') as f:
    f.writelines([value + '\n' for value in my_values])
If you are looking to print a range of numbers, you could use a for loop with range (or xrange if using Python 2.x and printing a lot of numbers).
Reading individual lines from a file
To read individual lines from a file, you can also use a for loop:
my_list = []
with open(file_path,'r') as f:
    for line in f:
        my_list.append(line.strip())  # strip out newline characters
This way you can iterate through the lines of the file with a for loop (or just process them as you read them, particularly if it's a large file).
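
A round trip that shows writelines() with and without explicit newlines, sketched against a temporary file (the path and variable names are placeholders for the example):

```python
import os
import tempfile

values = ["1", "2", "3"]
fd, path = tempfile.mkstemp(suffix=".txt")
os.close(fd)  # only the path is needed; the file is reopened below

# writelines() writes the strings back to back, with no separators.
with open(path, "w") as f:
    f.writelines(values)
with open(path) as f:
    without_newlines = f.read()

# Adding "\n" to each value puts every value on its own line.
with open(path, "w") as f:
    f.writelines(value + "\n" for value in values)
with open(path) as f:
    with_newlines = f.read()

# Reading back line by line and dropping the newline recovers the values.
with open(path) as f:
    read_back = [line.rstrip("\n") for line in f]

os.remove(path)

print(repr(without_newlines))
print(repr(with_newlines))
print(read_back)
```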

Python Insert text before a specific line

I want to insert text just before a specific line.
I want to insert 'Hello Everyone' before the line starting with 'Number'.
My code:
import re
result = []
with open("text2.txt", "r+") as f:
    a = [x.rstrip() for x in f]  # stores all lines from f in a list and removes "\n"
    # Find the first occurrence of "Centre" and store its index
    for item in a:
        if item.startswith("Number"):  # same as your re check
            break
    ind = a.index(item)  # here it produces the index/line number
    result.extend(a[:ind])
    f.write('Hello Everyone')
Text file:
QWEW
RW
...
Number hey
Number ho
Expected output:
QWEW
RW
...
Hello Everyone
Number hey
Number ho
Please help me fix my code: nothing gets inserted into my text file.
Answers will be appreciated!

The problem
When you do open("text2.txt", "r"), you open your file for reading, not for writing. Therefore, nothing appears in your file.
The fix
Using r+ instead of r allows you to also write to the file (this was also pointed out in the comments). However, it overwrites, so be careful (this is an OS limitation, as described e.g. here). The following should do what you desire: it inserts "Hello everyone" into the list of lines and then overwrites the file with the updated lines.
with open("text2.txt", "r+") as f:
    a = [x.rstrip() for x in f]
    index = 0
    for item in a:
        if item.startswith("Number"):
            a.insert(index, "Hello everyone")  # Inserts "Hello everyone" into `a`
            break
        index += 1
    # Go to start of file and clear it
    f.seek(0)
    f.truncate()
    # Write each line back
    for line in a:
        f.write(line + "\n")

The correct answer to your problem is the one by hlt, but consider also using the fileinput module:
import fileinput
found = False
for line in fileinput.input('DATA', inplace=True):
    if not found and line.startswith('Number'):
        print('Hello everyone')
        found = True
    print(line, end='')

This is basically the same question as here: they propose to do it in three steps: read everything, insert, rewrite everything.
with open("/tmp/text2.txt", "r") as f:
    lines = f.readlines()
for index, line in enumerate(lines):
    if line.startswith("Number"):
        break
lines.insert(index, "Hello everyone !\n")
with open("/tmp/text2.txt", "w") as f:
    f.writelines(lines)
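
The insert step of the read / insert / rewrite approach can be sketched as a pure function on a list of lines, which makes the case where no line matches explicit. The name insert_before_first and the choice to leave the lines unchanged when nothing matches are assumptions for this illustration, not part of the answers above.

```python
def insert_before_first(lines, prefix, new_line):
    """Return a copy of lines with new_line inserted before the first
    line that starts with prefix; return an unchanged copy if none does."""
    for index, line in enumerate(lines):
        if line.startswith(prefix):
            return lines[:index] + [new_line] + lines[index:]
    return list(lines)

# Sample data modelled on the question's text file.
lines = ["QWEW\n", "RW\n", "Number hey\n", "Number ho\n"]
updated = insert_before_first(lines, "Number", "Hello Everyone\n")
unchanged = insert_before_first(lines, "Missing", "Hello Everyone\n")

print("".join(updated), end="")
```

To apply it to a file, read the lines with readlines(), call the function, and write the result back with writelines() after reopening the file in 'w' mode.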
