This question already has answers here:
Deleting a line in a file
(4 answers)
Closed 4 years ago.
I want to read a text file line by line and delete each line once it has been read. The steps are as follows.
Read the line
Delete the line
Repeat steps 1 and 2 until the file is empty
For example data in file is like below.
1,2,3,4,5
a,b,c,d,e
q,w,e,r,t
a,s,d,f,g
So, steps will be ...
while Reading
Read 1,2,3,4,5
Delete 1,2,3,4,5
then next line
Read a,b,c,d,e
Delete a,b,c,d,e
..
..
and so on until the file is empty.
I know Python well, but I am not able to do it.
Don't even try!
At its lowest level, a file is a sequential stream of bytes. You can read bytes from that stream, overwrite bytes, and position your cursor (the position at which to read or write) anywhere between the beginning and the end of the file. When you are at the end of the file and write, you extend the file. You can also truncate a file (remove everything past the cursor).
So there is no way to remove the initial part of a file! It is commonly done by writing a new file containing what needs to be kept, removing the original file, and renaming the new file to give it the original name.
I assume that you are using the wrong tool to solve your real problem...
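The write-a-new-file-and-rename approach described above could be sketched as follows. This is only an illustration: the function name `pop_first_line` is hypothetical, and the sketch assumes the file fits the platform's default text encoding.

```python
import os
import tempfile


def pop_first_line(path):
    """Return the first line of `path` and rewrite the file without it.

    Returns None when the file is already empty.
    """
    directory = os.path.dirname(os.path.abspath(path))
    with open(path, "r", newline="") as src:
        first = src.readline()
        if not first:
            return None  # nothing left to remove
        # Create the temporary file next to the original so that the final
        # rename stays on the same filesystem.
        with tempfile.NamedTemporaryFile(
            "w", newline="", dir=directory, delete=False
        ) as tmp:
            for line in src:  # copy everything after the first line
                tmp.write(line)
    # Both files are closed here; swap the new file in under the old name.
    os.replace(tmp.name, path)
    return first
```

Note that every call copies the whole remainder of the file, so calling it once per line costs time proportional to the square of the file size. If the goal is just to consume every line, it is usually cheaper to read the file normally and empty it once at the end.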
This question already has answers here:
Python seek to read constantly growing files
(1 answer)
How can I tail a log file in Python?
(14 answers)
Closed 2 years ago.
GOAL
I have a text file which receives at least 20 new lines every second.
Every time a new line appears in my text file, I want that line to go through a process. For example, the line is converted to uppercase and printed. The program then waits for the next line to appear.
WHAT I TRIED
My idea was to read the file line by line, the way a file is normally read in Python, expecting that more lines would have been added by the time it had read the existing ones. However, Python reads the lines too fast...
This was my code:
import time

file = open('/file/path/text.txt')
for line in file:
    line = line.upper()  # upper is a method and has to be called
    print(line)
    time.sleep(1)
I know there is PySpark, but is there an easier solution? I only have one text file, and it only has to go through a simple process (e.g. uppercase the line that was read, print it, and then wait for the next line to appear).
I am a newbie in python so I am sorry if the answer is obvious for some of you. I would appreciate any help. Thank you.
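One common way to get this behaviour is to keep the file open and poll it: `readline()` returns an empty string at the end of the file, and returns data again once the file has grown. A minimal sketch, assuming the writer only ever appends to the file; the name `follow` and the parameters `poll_interval` and `max_idle_polls` are made up for this example.

```python
import time


def follow(path, poll_interval=0.5, max_idle_polls=None):
    """Yield complete lines from `path`, waiting for new ones to be appended.

    With max_idle_polls=None the generator never stops on its own.
    Otherwise it returns after that many consecutive polls without new data.
    """
    idle = 0
    pending = ""  # holds a line whose trailing newline has not arrived yet
    with open(path) as f:
        while True:
            chunk = f.readline()
            if chunk:
                idle = 0
                pending += chunk
                if pending.endswith("\n"):
                    yield pending
                    pending = ""
                continue
            # Reached the current end of the file: wait, then try again.
            idle += 1
            if max_idle_polls is not None and idle >= max_idle_polls:
                return
            time.sleep(poll_interval)
```

It could then be used as `for line in follow('/file/path/text.txt'): print(line.upper(), end='')`. The sketch starts at the beginning of the file, so lines that already exist are processed first.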
This question already has answers here:
How should I read a file line-by-line in Python?
(3 answers)
Closed 3 years ago.
I have a very big file (~10GB) and I want to read it in its entirety. To achieve this, I cut it into chunks. However, I have trouble cutting the big file into usable pieces: I want thousands of lines together, without any of them being split in the middle. I found a function here on SO that I have adapted a bit:
def readPieces(file):
    while True:
        data = file.read(4096).strip()
        if not data:
            break
        yield data

with open('bigfile.txt', 'r') as f:
    for chunk in readPieces(f):
        print(chunk)
I can specify how much I want to read at a time (here 4096 characters, roughly 4 KB), but when I do so my lines get cut in the middle, and if I remove the size, it tries to read the whole file at once, which makes the process stop. How can I do this?
Also, the lines in my file are not all the same length.
The following code reads the file line by line. Only the current line is held in memory; the previous line gets garbage collected.
with open('bigfile.txt') as file:
    for line in file:
        print(line)
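If the goal is to handle thousands of whole lines at a time rather than one line per iteration, the same line iteration can be grouped into batches. A minimal sketch using `itertools.islice`; the name `line_batches` and the default batch size are assumptions for illustration.

```python
from itertools import islice


def line_batches(file, batch_size=1000):
    """Yield lists of up to `batch_size` whole lines from an open text file."""
    while True:
        # islice pulls at most batch_size lines from the file iterator,
        # so lines are never split and only one batch is in memory at a time.
        batch = list(islice(file, batch_size))
        if not batch:
            return
        yield batch
```

For example, `with open('bigfile.txt') as f: for batch in line_batches(f, 5000): ...` would process 5000 complete lines per iteration, with a shorter final batch.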
This question already has answers here:
How to append a new row to an old CSV file in Python?
(8 answers)
Closed 4 years ago.
How does one edit specific lines of a file?
Example
What's your name?
What do you do?
What's your favourite colour?
And then after running a specific program, the outcome will be..
What's your name? : Thatile
What do you do? : I am a student
What's your favourite colour? : Black
Using file.seek() and then writing overwrites the existing text, and I want to keep the original text in the file and only edit the lines.
A simple solution (though not memory efficient) is to read the whole file to memory (using file.read() or file.readline() in a loop), edit the data you obtained to append the answers, then write the modified data to the original file (overwriting its original content).
Again, this is not memory efficient and could take a long time on a large file.
Short answer: you cannot "edit a specific line in a file", you have to read the whole file and overwrite it with the edited content. The safe and memory-efficient way is to
open a temporary file for writing
open the original file for reading
loop over the original file's content and for each line, update the line with your new content and write it to the temporary file
close all files and replace the original with the temporary one
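The steps above can be sketched as follows for the question's example of appending an answer to each line. The function name `append_to_lines` and the ` : ` separator handling are illustrative assumptions, not part of the original answer.

```python
import os
import tempfile


def append_to_lines(path, answers):
    """Rewrite `path`, appending ' : <answer>' to each line in order.

    Lines beyond the number of answers are copied unchanged.
    """
    directory = os.path.dirname(os.path.abspath(path))
    remaining = iter(answers)
    # Open the original for reading and a temporary file (in the same
    # directory) for writing.
    with open(path, "r") as src, tempfile.NamedTemporaryFile(
        "w", dir=directory, delete=False
    ) as tmp:
        for line in src:
            text = line.rstrip("\n")
            answer = next(remaining, None)
            if answer is not None:
                text = f"{text} : {answer}"
            tmp.write(text + "\n")
    # Both files are closed; replace the original with the edited copy.
    os.replace(tmp.name, path)
```

Only one line is held in memory at a time, and the original file is left untouched until the final `os.replace` call.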
I have a Python script that is supposed to read a file. The issue is that the file is very large, so for efficiency I decided that my script should only read from line 650000 onward, since the earlier lines do not contain relevant information.
Is there any way to only read or modify lines 650000 to EOF, so that, for example, if I read() this file only those specific lines would appear?
Files are not line-oriented, they are blocks of bytes.
There's no way, short of reading the data in, to figure out how many bytes make up those first 650,000 lines, so you'd have to do that just in order to skip them.
Starting to modify a file at a certain offset is possible, but that offset will be in bytes, which is the addressing unit used by files.
Skipping lines can be done easily enough:
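with open("myfile.txt", "r+") as f:  # "r+" keeps the existing content; "w+" would erase it on open
    for i in range(650000):
        f.readline() # Read a line and throw it away
    f.seek(f.tell()) # Reposition explicitly before switching from reading to writing
    f.write("hello")
    f.truncate() # Drop everything after the hello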
This will truncate the file so that there will be no data after the hello (but 650,000 lines before it, of course).
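For the read-only half of the question (seeing only the lines from 650000 onward), the skipped lines still have to be read, but they do not have to be kept. A minimal sketch using `itertools.islice`; the name `lines_from` is hypothetical, and `skip` is assumed to be the number of leading lines to discard.

```python
from itertools import islice


def lines_from(path, skip):
    """Yield the lines of `path` after discarding the first `skip` lines."""
    with open(path) as f:
        # islice consumes and drops the first `skip` lines one at a time,
        # so they are never all held in memory at once.
        yield from islice(f, skip, None)
```

For example, `for line in lines_from("myfile.txt", 650000): ...` would start at the 650,001st line of the file.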
This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
Fastest Way to Delete a Line from Large File in Python
How to edit a line in middle of txt file without overwriting everything?
I know I can read every line into a list, remove a line, then write the list back.
But the file is large, is there a way to remove a part in the middle of the file, and needn't rewrite the whole file?
I don't know of a way to change the file in place, even using low-level file system commands, but you don't need to load it into a list, so you can do this without a large memory footprint:
with open('input_file', 'r') as input_file:
    with open('output_file', 'w') as output_file:
        for line in input_file:
            if should_delete(line):
                pass
            else:
                output_file.write(line)
This assumes that the section you want to delete is a line in a text file, and that should_delete is a function which determines whether the line should be kept or deleted. It is easy to change this slightly to work with a binary file instead, or to use a counter instead of a function.
Edit: If you're dealing with a binary file, you know the exact position you want to remove, and it's not too near the start of the file, you may be able to optimise it slightly with io.IOBase.truncate (see http://docs.python.org/2/library/io.html#io.IOBase). However, I would only suggest pursuing this if a profiler indicates that you really need to optimise to this extent.
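One way the truncate-based optimisation might look: copy everything after the removed section forward over it in chunks, then cut the file at the new end. This is a sketch under the assumption that the byte offset and length of the section are already known; the name `remove_range_in_place` and the `chunk_size` parameter are made up here. Unlike the temporary-file approach, an interruption halfway through leaves the file in a damaged state.

```python
def remove_range_in_place(path, start, length, chunk_size=64 * 1024):
    """Remove `length` bytes beginning at byte offset `start` from `path`.

    Only the part of the file after `start` is rewritten.
    """
    with open(path, "r+b") as f:
        read_pos = start + length  # first byte to keep after the gap
        write_pos = start          # where that byte has to end up
        while True:
            f.seek(read_pos)
            chunk = f.read(chunk_size)
            if not chunk:
                break
            f.seek(write_pos)
            f.write(chunk)
            read_pos += len(chunk)
            write_pos += len(chunk)
        # Everything from write_pos onward is now stale duplicate data.
        f.truncate(write_pos)
```

The closer the removed section is to the end of the file, the less data has to be moved, which is why this only pays off when the section is not near the start.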