Combined effect of reading lines twice?

Combined effect of reading lines twice? - python

As a practice, I am learning to reading a file.
As is obvious from code, hopefully, I have a file in working/root whatever directory. I need to read it and print it.
my_file=open("new.txt","r")
lengt=sum(1 for line in my_file)
for i in range(0,lengt-1):
myline=my_file.readlines(1)[0]
print(myline)
my_file.close()
This returns error and says out of range.
The text file simply contains statements like
line one
line two
line three
.
.
.
Everything same, I tried myline=my_file.readline(). I get empty 7 lines.
My guess is that while using for line in my_file, I read up the lines. So reached end of document. To get same result as I desire, I do I overcome this?
P.S. if it mattersm it's python 3.3

No need to count along. Python does it for you:
my_file = open("new.txt","r")
for myline in my_file:
print(myline)
Details:
my_file is an iterator. This a special object that allows to iterate over it.
You can also access a single line:
line 1 = next(my_file)
gives you the first line assuming you just opened the file. Doing it again:
line 2 = next(my_file)
you get the second line. If you now iterate over it:
for myline in my_file:
# do something
it will start at line 3.
Stange extra lines?
print(myline)
will likely print an extra empty line. This is due to a newline read from the file and a newline added by print(). Solution:
Python 3:
print(myline, end='')
Python 2:
print myline, # note the trailing comma.
Playing it save
Using the with statement like this:
with open("new.txt", "r") as my_file:
for myline in my_file:
print(myline)
# my_file is open here
# my_file is closed here
you don't need to close the file as it done as soon you leave the context, i.e. as soon as you continue with your code an the same level as the with statement.

You can actually take care of all of this at once by iterating over the file contents:
my_file = open("new.txt", "r")
length = 0
for line in my_file:
length += 1
print(line)
my_file.close()
At the end, you will have printed all of the lines, and length will contain the number of lines in the file. (If you don't specifically need to know length, there's really no need for it!)
Another way to do it, which will close the file for you (and, in fact, will even close the file if an exception is raised):
length = 0
with open("new.txt", "r") as my_file:
for line in my_file:
length += 1
print(line)

Related

how to read data line by line in python

I have a textfile which is filled with data like follows:
#n
44026533495303941500076737402297403862946691
#e
6969696
#f
37243759787836627691897628674719248256836857
In the end, I want to know the numbers saved with the variables n, e, f
I tried to read it line by line, but the datastream gives me only letter by letter
My code was following:
file = open(sys.argv[2]).read() # for getting file
for line in file:
print(line) # but it gives letter for letter
My idea was to take for example
n = file[1]
e = file[5]

Close, but no cigar.
You should get rid of the .read(), it reads the whole file.
Here is what you were probably aiming for...
file = open(sys.argv[2]) # no .read() please
for line in file:
print(line) # now it gives the line
file.close() # don't forget to release the resource!
... but this is what you really want
with open(sys.argv[2], 'r') as input_file:
for line in input_file:
print(line)
By using the with keyword, you don't have to remember to close the resource! (here's a tutorial on it).
Also if you specify 'r' in the open, it's a little more obvious what you intend to do with the file. Not crucial but recommended.

Strange characters in the begining of the file after writing in Python

I want to do a lot of boring C# code replacements automatically through a python script. I read all lines of the file, transform them, truncate the whole file, write new strings and close it.
f = open(file, 'r+')
text = f.readlines()
# some changes
f.truncate(0)
for line in text:
f.write(line)
f.close()
All my changes are written. But some strange characters in the beginning of the file appear. I don't know how to avoid them. Even if I open with encoding='utf-8-sig' it doesn't help.
I tried truncate whole file besides the 1st line like this:
import sys
f.truncate(sys.getsizeof(text[0]))
for index in range(1, len(text), 1):
f.write(text[index])
But in this case more than 1st line is writing instead of only first line.
EDIT
I tried this:
f.truncate(len(text[0]))
for index in range(1, len(text), 1):
f.write(text[index])
And the first line has written correct but next one with the same issue. So I think this characters from the end of the file and I try to write after them.

f=open(file, 'r+')
text = f.readlines() # After reading all the lines, the pointer is at the end of the file.
# some changes
f.seek(0) # To bring the pointer back to the starting of the file.
f.truncate() # Don't pass any value in truncate() as it means number of bytes to be truncated by default size of file.
for line in text:
f.write(line)
f.close()
Check out this Link for more details.

Python: Open a file, search then append, if not exist

I am trying to append a string to a file, if the string doesn't exit in the file. However, opening a file with a+ option doesn't allow me to do at once, because opening the file with a+ will put the pointer to the end of the file, meaning that my search will always fail. Is there any good way to do this other than opening the file to read first, close and open again to append?
In code, apparently, below doesn't work.
file = open("fileName", "a+")
I need to do following to achieve it.
file = open("fileName", "r")
... check if a string exist in the file
file.close()
... if the string doesn't exist in the file
file = open("fileName", "a")
file.write("a string")
file.close()

To leave the input file unchanged if needle is on any line or to append the needle at the end of the file if it is missing:
with open("filename", "r+") as file:
for line in file:
if needle in line:
break
else: # not found, we are at the eof
file.write(needle) # append missing data
I've tested it and it works on both Python 2 (stdio-based I/O) and Python 3 (POSIX read/write-based I/O).
The code uses obscure else after a loop Python syntax. See Why does python use 'else' after for and while loops?

You can set the current position of the file object using file.seek(). To jump to the beginning of a file, use
f.seek(0, os.SEEK_SET)
To jump to a file's end, use
f.seek(0, os.SEEK_END)
In your case, to check if a file contains something, and then maybe append append to the file, I'd do something like this:
import os
with open("file.txt", "r+") as f:
line_found = any("foo" in line for line in f)
if not line_found:
f.seek(0, os.SEEK_END)
f.write("yay, a new line!\n")

There is a minor bug in the previous answers: often, the last line in a text file is missing an ending newline. If you do not take that that into account and blindly append some text, your text will be appended to the last line.
For safety:
needle = "Add this line if missing"
with open("filename", "r+") as file:
ends_with_newline = True
for line in file:
ends_with_newline = line.endswith("\n")
if line.rstrip("\n\r") == needle:
break
else: # not found, we are at the eof
if not ends_with_newline:
file.write("\n")
file.write(needle + "\n") # append missing data

Remove whitespaces in the beginning of every string in a file in python?

How to remove whitespaces in the beginning of every string in a file with python?
I have a file myfile.txt with the strings as shown below in it:
_ _ Amazon.inc
Arab emirates
_ Zynga
Anglo-Indian
Those underscores are spaces.
The code must be in a way that it must go through each and every line of a file and remove all those whitespaces, in the beginning of a line.
I've tried using lstrip but that's not working for multiple lines and readlines() too.
Using a for loop can make it better?

All you need to do is read the lines of the file one by one and remove the leading whitespace for each line. After that, you can join again the lines and you'll get back the original text without the whitespace:
with open('myfile.txt') as f:
line_lst = [line.lstrip() for line in f.readlines()]
lines = ''.join(line_lst)
print lines

Assuming that your input data is in infile.txt, and you want to write this file to output.txt, it is easiest to use a list comprehension:
inf = open("infile.txt")
stripped_lines = [l.lstrip() for l in inf.readlines()]
inf.close()
# write the new, stripped lines to a file
outf = open("output.txt", "w")
outf.write("".join(stripped_lines))
outf.close()

To read the lines from myfile.txt and write them to output.txt, use
with open("myfile.txt") as input:
with open("output.txt", "w") as output:
for line in input:
output.write(line.lstrip())
That will make sure that you close the files after you're done with them, and it'll make sure that you only keep a single line in memory at a time.
The above code works in Python 2.5 and later because of the with keyword. For Python 2.4 you can use
input = open("myfile.txt")
output = open("output.txt", "w")
for line in input:
output.write(line.lstrip())
if this is just a small script where the files will be closed automatically at the end. If this is part of a larger program, then you'll want to explicitly close the files like this:
input = open("myfile.txt")
try:
output = open("output.txt", "w")
try:
for line in input:
output.write(line.lstrip())
finally:
output.close()
finally:
input.close()
You say you already tried with lstrip and that it didn't work for multiple lines. The "trick" is to run lstrip on each individual line line I do above. You can try the code out online if you want.

How to delete parts of a file in python?

I have a file named a.txt which looks like this:
I'm the first line
I'm the second line.
There may be more lines here.
I'm below an empty line.
I'm a line.
More lines here.
Now, I want to remove the contents above the empty line(including the empty line itself).
How could I do this in a Pythonic way?

Basically you can't delete stuff from the beginning of a file, so you will have to write to a new file.
I think the pythonic way looks like this:
# get a iterator over the lines in the file:
with open("input.txt", 'rt') as lines:
# while the line is not empty drop it
for line in lines:
if not line.strip():
break
# now lines is at the point after the first paragraph
# so write out everything from here
with open("output.txt", 'wt') as out:
out.writelines(lines)
Here are some simpler versions of this, without with for older Python versions:
lines = open("input.txt", 'rt')
for line in lines:
if not line.strip():
break
open("output.txt", 'wt').writelines(lines)
and a very straight forward version that simply splits the file at the empty line:
# first, read everything from the old file
text = open("input.txt", 'rt').read()
# split it at the first empty line ("\n\n")
first, rest = text.split('\n\n',1)
# make a new file and write the rest
open("output.txt", 'wt').write(rest)
Note that this can be pretty fragile, for example windows often uses \r\n as a single linebreak, so a empty line would be \r\n\r\n instead. But often you know the format of the file uses one kind of linebreaks only, so this could be fine.

Naive approach by iterating over the lines in the file one by one top to bottom:
#!/usr/bin/env python
with open("4692065.txt", 'r') as src, open("4692065.cut.txt", "w") as dest:
keep = False
for line in src:
if keep: dest.write(line)
if line.strip() == '': keep = True

The fileinput module (from the standard library) is convenient for this kind of thing. It sets things up so you can act as though your are editing the file "in-place":
import fileinput
import sys
fileobj=iter(fileinput.input(['a.txt'], inplace=True))
# iterate through the file until you find an empty line.
for line in fileobj:
if not line.strip():
break
# Iterators (like `fileobj`) pick up where they left off.
# Starting a new for-loop saves you one `if` statement and boolean variable.
for line in fileobj:
sys.stdout.write(line)

Any idea how big the file is going to be?
You could read the file into memory:
f = open('your_file', 'r')
lines = f.readlines()
which will read the file line by line and store those lines in a list (lines).
Then, close the file and reopen with 'w':
f.close()
f = open('your_file', 'w')
for line in lines:
if your_if_here:
f.write(line)
This will overwrite the current file. Then you can pick and choose which lines from the list you want to write back in. Probably not a very good idea if the file gets to large though, since the entire file has to reside in memory. But, it doesn't require that you create a second file to dump your output.

from itertools import dropwhile, islice
def content_after_emptyline(file_object):
return islice(dropwhile(lambda line: line.strip(), file_object), 1, None)
with open("filename") as f:
for line in content_after_emptyline(f):
print line,

You could do a little something like this:
with open('a.txt', 'r') as file:
lines = file.readlines()
blank_line = lines.index('\n')
lines = lines[blank_line+1:] #\n is the index of the blank line
with open('a.txt', 'w') as file:
file.write('\n'.join(lines))
and that makes the job much simpler.

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.

Combined effect of reading lines twice? - python

Related

how to read data line by line in python

Strange characters in the begining of the file after writing in Python

Python: Open a file, search then append, if not exist

Remove whitespaces in the beginning of every string in a file in python?

How to delete parts of a file in python?

Categories

Resources