Re-ordering text file - Python

I must re-order an input file and then print the output to a new file.
This is the input file:
The first line never changes.
The second line was a bit much longer.
The third line was short.
The fourth line was nearly the longer line.
The fifth was tiny.
The sixth line is just one line more.
The seventh line was the last line of the original file.
This is what the output file should look like:
The first line never changes.
The seventh line was the last line of the original file.
The second line was a bit much longer.
The sixth line is just one line more.
The third line was short.
The fifth was tiny.
The fourth line was nearly the longer line.
I already have code that reverses the input file and prints it to the output file. It looks like this:
ifile_name = open(ifile_name, 'r')
lines = ifile_name.readlines()
ofile_name = open(ofile_name, "w")
lines[-1] = lines[-1].rstrip() + '\n'
for line in reversed(lines):
    ofile_name.write(line)
ifile_name.close()
ofile_name.close()
Is there any way I can get the desired format in the text file while keeping my reverse code?
For example: print the first line of the input file, then the first line of the reversed file, then the second line of the input file, then the second line of the reversed file, and so on.
Sorry if this seems unclear; I am very new to Python and Stack Overflow.
Thanks in advance.

I believe this is a more elegant solution, if you don't mind that the list is consumed in the process.
with open("ifile_name","r") as f:
    init_list=f.read().strip().splitlines()
with open("result.txt","a") as f1:
    while True:
        try:
            f1.write(init_list.pop(0)+"\n")
            f1.write(init_list.pop()+"\n")
        except IndexError:
            break

ifile_name = "hello/input.txt"
ofile_name = "hello/output.txt"
ifile_name = open(ifile_name, 'r')
lines = ifile_name.readlines()
ofile_name = open(ofile_name, "w")
lines[-1] = lines[-1].rstrip() + '\n'
start = 0
end = len(lines) - 1
while start < end:
    ofile_name.write(lines[start])
    ofile_name.write(lines[end])
    start += 1
    end -= 1
if start == end:
    ofile_name.write(lines[start])
ifile_name.close()
ofile_name.close()
Use two indices, start and end, to point at the lines to write to the file.
Once start == end, write the middle line to the file.
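
The two-index idea above can also be packaged as a small function that works on a list of lines, which makes it easy to try without any files. This is only a sketch; the name interleave_ends is made up for illustration and does not come from the answers above.

```python
def interleave_ends(lines):
    """Return lines reordered as first, last, second, second-to-last, ..."""
    result = []
    start, end = 0, len(lines) - 1
    while start < end:
        result.append(lines[start])
        result.append(lines[end])
        start += 1
        end -= 1
    if start == end:  # odd number of lines: the middle one is left over
        result.append(lines[start])
    return result


print(interleave_ends(["1", "2", "3", "4", "5", "6", "7"]))
# prints ['1', '7', '2', '6', '3', '5', '4']
```

The result could then be written out with ofile_name.writelines(...), provided every line still ends with a newline.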

Related

How to move the cursor to a specific line in python

I am reading data from a .txt file. I need to read lines starting from a certain line, so that I don't have to read the whole file (using .readlines()). Since I know which line I should start reading from, I came up with this (it does not work, though):
def create_list(pos):
    list_created = []
    with open('text_file.txt', 'r') as f:
        f.seek(pos) #Here I want to put the cursor at the beginning of the line that I need to read from
        line = f.readline() #And here I read the first line
        while line != '<end>\n':
            line = line.rstrip('\n')
            list_created.append(line.split(' '))
            line = f.readline()
    f.close()
    return list_created
print(create_list(2)) #Here I need to create a list starting from the 3rd line of my file
And my text file looks something like this:
Something #line in pos= 0
<start> #line in pos= 1
MY FIRST LINE #line in pos= 2
MY SECOND LINE #line in pos= 3
<end>
And the result should be something like:
[['MY', 'FIRST', 'LINE'], ['MY', 'SECOND', 'LINE']]
Basically, I need to start my readline() from a specific line.
Does this work? If you don't want to read the entire file with .readlines(), you can skip a single line by calling .readline(). This way you can call readline() as many times as needed to move the cursor down, then read the line you want. Also, I don't recommend using line != '<end>\n' unless you're absolutely sure that there will be a newline after <end>. Instead, do something like '<end>' not in line:
def create_list(pos):
    list_created = []
    with open('text_file.txt', 'r') as f:
        for i in range(pos):
            f.readline() # skip the lines before pos
        line = f.readline() #And here I read the first line
        # readline() returns '' at end of file, so also stop there
        while line and '<end>' not in line:
            line = line.rstrip('\n')
            list_created.append(line.split(' '))
            line = f.readline()
    return list_created
print(create_list(2)) #Here I need to create a list starting from the 3rd line of my file
text_file.txt:
Something
<start>
MY FIRST LINE
MY SECOND LINE
<end>
Output:
[['MY', 'FIRST', 'LINE'], ['MY', 'SECOND', 'LINE']]
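
As an alternative to calling readline() in a loop to skip lines, itertools.islice can do the skipping. The sketch below assumes the same file layout as above; the extra parameters path and end_marker are additions for illustration and are not part of the original function.

```python
from itertools import islice


def create_list(path, pos, end_marker="<end>"):
    """Collect split lines starting at 0-based line `pos` until the end marker."""
    list_created = []
    with open(path, "r") as f:
        # islice consumes and discards the first `pos` lines without storing them
        for line in islice(f, pos, None):
            if end_marker in line:
                break
            list_created.append(line.split())
    return list_created
```

Calling create_list('text_file.txt', 2) on the sample file should give the same nested list as shown above. Note that the skipped lines are still read from disk; a file cannot jump to line N directly unless the byte offset of that line is already known.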

Deleting a specific number of lines from text file using Python

Suppose I have a text file that goes like this:
AAAAAAAAAAAAAAAAAAAAA #<--- line 1
BBBBBBBBBBBBBBBBBBBBB #<--- line 2
CCCCCCCCCCCCCCCCCCCCC #<--- line 3
DDDDDDDDDDDDDDDDDDDDD #<--- line 4
EEEEEEEEEEEEEEEEEEEEE #<--- line 5
FFFFFFFFFFFFFFFFFFFFF #<--- line 6
GGGGGGGGGGGGGGGGGGGGG #<--- line 7
HHHHHHHHHHHHHHHHHHHHH #<--- line 8
Ignore "#<--- line...", it's just for demonstration
Assumptions
I don't know what line 3 is going to contain (because it changes all the time)...
The first 2 lines have to be deleted...
After the first 2 lines, I want to keep 3 lines...
Then, I want to delete all lines after the 3rd line.
End Result
The end result should look like this:
CCCCCCCCCCCCCCCCCCCCC #<--- line 3
DDDDDDDDDDDDDDDDDDDDD #<--- line 4
EEEEEEEEEEEEEEEEEEEEE #<--- line 5
Lines deleted: First 2 + Everything after the next 3 (i.e. after line 5)
Required
All Pythonic suggestions are welcome! Thanks!
Reference Material
https://thispointer.com/python-how-to-delete-specific-lines-in-a-file-in-a-memory-efficient-way/
import os

def delete_multiple_lines(original_file, line_numbers):
    """In a file, delete the lines at the line numbers in the given list"""
    is_skipped = False
    counter = 0
    # Create name of dummy / temporary file
    dummy_file = original_file + '.bak'
    # Open original file in read only mode and dummy file in write mode
    with open(original_file, 'r') as read_obj, open(dummy_file, 'w') as write_obj:
        # Line by line copy data from original file to dummy file
        for line in read_obj:
            # If current line number exists in list then skip copying that line
            if counter not in line_numbers:
                write_obj.write(line)
            else:
                is_skipped = True
            counter += 1
    # If any line is skipped then rename dummy file as original file
    if is_skipped:
        os.remove(original_file)
        os.rename(dummy_file, original_file)
    else:
        os.remove(dummy_file)
Then...
delete_multiple_lines('sample.txt', [0,1,2])
The problem with this method might be that, if your file had 1-100 lines on top to delete, you'll have to specify [0,1,2...100]. Right?
Answer
Courtesy of @sandes
The following code will:
delete the first 63
get you the next 95
ignore the rest
create a new file
with open("sample.txt", "r") as f:
    lines = f.readlines()
new_lines = []
skip, keep = 63, 95
idx_lines_wanted = range(skip, skip + keep)
# delete first 63, then get the next 95
for i, line in enumerate(lines):
    if i >= skip + keep:
        break  # everything after the wanted range is ignored
    if i in idx_lines_wanted:
        new_lines.append(line)
with open("sample2.txt", "w") as f:
    for line in new_lines:
        f.write(line)
EDIT: iterating directly over f
based on @Kenny's comment and @chepner's suggestion
with open("your_file.txt", "r") as f:
    new_lines = []
    for idx, line in enumerate(f):
        if idx in range(2, 5): # indices 2, 3, 4
            new_lines.append(line)
with open("your_new_file.txt", "w") as f:
    for line in new_lines:
        f.write(line)
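
For the case in the question (skip the first 2 lines, keep the next 3, drop the rest), the same idea can be sketched with itertools.islice, which avoids building an index list and stops reading once the wanted range is done. The function name keep_line_range and its parameters are hypothetical, chosen only for this example.

```python
from itertools import islice


def keep_line_range(src_path, dst_path, skip, keep):
    """Copy `keep` lines from src to dst after skipping the first `skip` lines."""
    with open(src_path, "r") as src, open(dst_path, "w") as dst:
        # islice yields lines skip .. skip+keep-1 and then stops iterating
        dst.writelines(islice(src, skip, skip + keep))
```

For the example file, keep_line_range("your_file.txt", "your_new_file.txt", 2, 3) would leave lines 3 to 5 in the new file.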
This is really something that's better handled by an actual text editor.
import subprocess
subprocess.run(['ed', original_file], input=b'1,2d\n+3,$d\nwq\n')
A crash course in ed, the POSIX standard text editor.
ed opens the file named by its argument. It then proceeds to read commands from its standard input. Each command is a single character, with some commands taking one or two "addresses" to indicate which lines to operate on.
After each command, the "current" line number is set to the line last affected by a command. This is used with relative addresses, as we'll see in a moment.
1,2d means to delete lines 1 through 2; after a delete, the current line becomes the line that followed the deleted range, which is now line 1 (originally line 3)
+3,$d deletes all the lines from the current line plus 3 (now line 4, originally line 6) through the end of the file ($ is a special address indicating the last line of the file)
wq writes all changes to disk and quits the editor.

Extract last line of very long txt file

I have a very long file containing data ("text.txt") and a second file that contains exactly 1 line, which is the last line of text.txt. This single line should be overwritten every 10 minutes (done by a simple cron job), as text.txt receives another line every 10 minutes.
So, based on other code snippets I found on Stack Overflow, I currently run this code:
#!/usr/bin/env python
import os, sys
file = open(sys.argv[1], "r+")
#Move the pointer (similar to a cursor in a text editor) to the end of the file.
file.seek(0, os.SEEK_END)
#This code means the following code skips the very last character in the file -
#i.e. in the case the last line is null we delete the last line
#and the penultimate one
pos = file.tell() - 1
#Read each character in the file one at a time from the penultimate
#character going backwards, searching for a newline character
#If we find a new line, exit the search
while pos > 0 and file.read(1) != "\n":
    pos -= 1
    file.seek(pos, os.SEEK_SET)
#So long as we're not at the start of the file, delete all the characters ahead of this position
if pos > 0:
    file.seek(pos, os.SEEK_SET)
    w = open("new.txt",'w')
    file.writelines(pos)
    w.close()
file.close()
With this code I get the error:
TypeError: writelines() requires an iterable argument
(of course). When using file.truncate() I can get rid of the last line in the original file; but I want to keep it there and just extract that last line to new.txt. But I don't comprehend how this works when working with file.seek. So I'd need help for the last part of the code.
file.readlines() with lines[:-1] does not work properly with such huge files.
Not sure why you're opening w, only to close it without doing anything with it. If you want new.txt to have all the text from file starting at pos and ending at the end, how about:
if pos > 0:
    file.seek(pos, os.SEEK_SET)
    w = open("new.txt",'w')
    w.write(file.read())
    w.close()
According to your code, pos is an integer that denotes the position of the first \n counting from the end of the file.
You cannot call file.writelines(pos), because writelines requires an iterable of lines and pos is a single integer.
Also, you want to write to new.txt, so you should write through w, not file. Example:
if pos > 0:
    file.seek(pos, os.SEEK_SET)
    w = open("new.txt",'w')
    w.write(file.read())
    w.close()
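How about the following approach:
import os
import sys

max_line_length = 1000
# Binary mode: Python 3 only allows seeking relative to the end in binary files
with open(sys.argv[1], "rb") as f_long, open('new.txt', 'wb') as f_new:
    # Assumes the file is at least max_line_length bytes long
    f_long.seek(-max_line_length, os.SEEK_END)
    lines = [line for line in f_long.read().split(b"\n") if len(line)]
    f_new.write(lines[-1])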
This will seek to almost the end of the file and read the remaining part of the file in. It is then split into non-empty lines and the last entry is written to new.txt.
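
If the maximum line length is not known in advance, one option is to read backwards from the end in fixed-size chunks until a newline before the last line shows up. The following is a minimal sketch of that idea, not code from any of the answers; the name last_line and the chunk_size parameter are assumptions made for the example.

```python
import os


def last_line(path, chunk_size=4096):
    """Return the last non-empty line of a file as bytes, reading from the end."""
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        pos = f.tell()
        data = b""
        while pos > 0:
            step = min(chunk_size, pos)
            pos -= step
            f.seek(pos)
            data = f.read(step) + data
            # Stop once a newline precedes the final line's content
            if b"\n" in data.rstrip(b"\r\n"):
                break
        stripped = data.rstrip(b"\r\n")
        return stripped.rsplit(b"\n", 1)[-1]
```

The returned bytes could then be written to new.txt opened in "wb" mode. Only the tail of the file is read, so the cost should not depend on how long text.txt has grown.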
Here's how to tail the last 2 lines of a file into a list:
import os
import subprocess
# subprocess does not expand ~, so expand it explicitly
path = os.path.expanduser('~/path/to/my_file.txt')
output = subprocess.check_output(['tail', '-n', '2', path])
lines = output.decode().splitlines()
Now you can get the info you need out of the list lines.

Reset iteration index after using next() Python [duplicate]

I am trying to edit a text file using fileinput.input(filename, inplace=1)
The text file has say 5 lines:
line 0
line 1
line 2
line 3
line 4
I wish to change data of line 1 based on info in line 2.
So I use a for loop
infile = fileinput.input(filename, inplace=1)
for line in infile:
    if(line2Data):
        #do something on line1
        print(line, end='')
    else:
        line1=next(infile)
        line2=next(infile)
        #do something with line2
Now my problem is that after the 1st iteration the iterator has advanced to line2, so in the 2nd iteration line is set to line3. I want line to be set to line1 in the 2nd iteration. I have tried line = line but it doesn't work.
Can you please let me know how I can reset the iteration index on line, which gets changed by next()?
PS: This is a simple example of a huge file and function I am working on.
As far as I know (and that is not much), there is no way to reset an iterator. This SO question may be useful. Since you say the file is huge, what I can think of is to process only part of the data. Following nosklo's answer in this SO question, I would try something like this (but that is really just a first guess):
while True:
    for line in open('really_big_file.dat'):
        process_data(line)
    if some_condition:
        break
OK, this attempt does not cover your requirement of restarting from the previous index.
There is no way to reset the iterator, but there is nothing stopping you from doing some of your processing before you start your loop:
import fileinput

infile = fileinput.input("foo.txt")
first_lines = [next(infile) for x in range(3)]
first_lines[1] = first_lines[1].strip() + " this is line2 > " + first_lines[2]
# each line already ends with a newline, so join and print without adding more
print("".join(first_lines), end="")
for line in infile:
    print(line, end="")
This uses next() to read the first 3 lines into a list. It then updates line1 based on line2 and prints all of them. It then continues to print the rest of the file using a normal loop.
For your sample, the output would be:
line 0
line 1 this is line2 > line 2
line 2
line 3
line 4
Note: if you are trying to modify the first lines of the file itself, rather than just display them, you would need to write the whole file to a new file. Writing to a file does not work like a word processor, where all the lines move down when a line or character is added. It works as if you were in overwrite mode.
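
When a change to one line depends on the line after it throughout the file, and not only at the top, a one-line lookahead avoids any need to reset the iterator. The generator below is a sketch of that pattern; the name with_next is hypothetical and the sample list stands in for the lines of a file.

```python
def with_next(lines):
    """Yield (line, next_line) pairs; next_line is None for the final line."""
    iterator = iter(lines)
    try:
        current = next(iterator)
    except StopIteration:
        return  # empty input: nothing to yield
    for upcoming in iterator:
        yield current, upcoming
        current = upcoming
    yield current, None


for line, following in with_next(["line 0", "line 1", "line 2"]):
    print(line, "->", following)
# prints:
# line 0 -> line 1
# line 1 -> line 2
# line 2 -> None
```

Because it accepts any iterable, the same generator should also work when passed a file object or the object returned by fileinput.input().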

How to only read a file between certain phrases

Just a basic question. I know how to read information from a file, but how would I go about only including the lines that are in between certain lines?
Say I have this :
Information Included in file but before "beginning of text"
" Beginning of text "
information I want
" end of text "
Information included in file but after the "end of text"
Thank you for any help you can give to get me started.
You can read the file line by line until you reach the start marker line, then do something with the lines (print them, store them in a list, etc.) until you reach the end marker line.
with open('myfile.txt') as f:
    line = f.readline()
    # skip everything up to the start marker; '' means end of file
    while line and line != ' Beginning of text \n':
        line = f.readline()
    line = f.readline()  # first line after the start marker
    while line and line != ' end of text \n':
        # add code to do something with the line here
        line = f.readline()
Make sure to exactly match the start- and end-markerlines. In your example they have a leading and trailing blank.
Yet another way to do it is to use the two-argument version of iter():
start = '" Beginning of text "\n'
end = '" end of text "\n'
with open('myfile.txt') as f:
    for line in iter(f.readline, start):
        pass
    for line in iter(f.readline, end):
        print(line, end='')
See https://docs.python.org/2/library/functions.html#iter for details.
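
One caveat with the sentinel form of iter(): if a marker never appears, readline() keeps returning '' at end of file and the loop never ends. A possible workaround, sketched below, is to use '' as the sentinel and compare against the markers inside the loop. The function name lines_between is made up for this example, and io.StringIO stands in for a real file.

```python
import io


def lines_between(f, start, end):
    """Yield the lines of file object `f` found after `start` and before `end`."""
    # Skip up to and including the start marker; '' from readline means EOF
    for line in iter(f.readline, ""):
        if line == start:
            break
    # Yield until the end marker (or EOF if the marker is missing)
    for line in iter(f.readline, ""):
        if line == end:
            break
        yield line


sample = io.StringIO(
    'before\n" Beginning of text "\ninformation I want\n" end of text "\nafter\n'
)
print(list(lines_between(sample, '" Beginning of text "\n', '" end of text "\n')))
# prints ['information I want\n']
```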
I would just read the file line by line and check whether each line matches the beginning or end string. The boolean readData then indicates whether you are between beginning and end, so you can copy the actual information to another variable.
# Open the file
f = open('myTextFile.txt')
# Read the first line
line = f.readline()
readData = False
# If the file is not empty keep reading line one at a time
# until the file is empty
while line:
    # Compare without the trailing newline and surrounding blanks
    # (adjust the marker strings if the file's marker lines include quotes)
    text = line.strip()
    # Check if line matches beginning
    if text == "Beginning of text":
        readData = True
    # Check if line matches end
    elif text == "end of text":
        readData = False
    # We are between beginning and end
    elif readData:
        (...)
    line = f.readline()
f.close()
