I have the following .txt-File (modified bash emboss-dreg report, the original report has seqtable format):
Start End Strand Pattern Sequence
43392 43420 + regex:[T][G][A][TC][C][CTG]\D{15,17}[CA][G][T][AT][AT][CTA] TGATCGCACGCCGAATGGAAACACGTTTT
52037 52064 + regex:[T][G][A][TC][C][CTG]\D{15,17}[CA][G][T][AT][AT][CTA] TGACCCTGCTTGGCGATCCCGGCGTTTC
188334 188360 + regex:[T][G][A][TC][C][CTG]\D{15,17}[CA][G][T][AT][AT][CTA] TGATCGCGCAACTGCAGCGGGAGTTAC
I would like to access the elements under "sequence" only, to compare them with some variables and delete the whole lines, if the comparison does not give the desired result (using Levenshtein distance for comparison).
But I can't even get started .... :(
I am searching for something like the linux -f option, to directly get to the right "field" in the line to do my comparison.
I came across re.split:
with open(textFile) as f:
    for line in f:
        cleaned=re.split(r'\t',line)
        print(cleaned)
which results in:
[' Start End Strand Pattern Sequence\n']
['\n']
[' 43392 43420 + regex:[T][G][A][TC][C][CTG]\\D{15,17}[CA][G][T][AT][AT][CTA] TGATCGCACGCCGAATGGAAACACGTTTT\n']
['\n']
[' 52037 52064 + regex:[T][G][A][TC][C][CTG]\\D{15,17}[CA][G][T][AT][AT][CTA] TGACCCTGCTTGGCGATCCCGGCGTTTC\n']
['\n']
[' 188334 188360 + regex:[T][G][A][TC][C][CTG]\\D{15,17}[CA][G][T][AT][AT][CTA] TGATCGCGCAACTGCAGCGGGAGTTAC\n']
['\n']
That is the closest I got to "split my lines into elements". I feel like I am going totally the wrong way, but searching Stack Overflow and Google did not turn up anything :(
I have never worked with the seqtable format before, so I tried to deal with it as a .txt file. Maybe there is a better approach for dealing with it?
Python is the main language I am learning and I am not very confident in Bash, but Bash answers for dealing with the issue would be OK for me, too.
I am thankful for any hint/link/help :)
The format itself seems to separate rows with blank lines, and splitting on r'\t' does nothing here because the data contains no tab characters (in a regular expression, \t matches a tab). Based on what you've pasted, the table is padded with a variable number of spaces rather than delimited by tabs.
To address both, read the file, treat the first line as a header (if you need it), then read the rest line by line: strip the leading/trailing whitespace, check whether any data is left and, if there is, split it on whitespace to get your line elements:
with open("your_data", "r") as f:
    header = f.readline().split()  # read the first line as a header
    for line in f:  # read the rest of the file line-by-line
        line = line.strip()  # first clear out the whitespace
        if line:  # check if there is any content left or is it an empty line
            elements = line.split()  # split the data on whitespace to get your elements
            print(elements[-1])  # print the last element
TGATCGCACGCCGAATGGAAACACGTTTT
TGACCCTGCTTGGCGATCCCGGCGTTTC
TGATCGCGCAACTGCAGCGGGAGTTAC
As a bonus, since you have the header, you can turn it into a map and then use 'proxied' named access to get the element you're looking for so you don't need to worry about the element position:
with open("your_data", "r") as f:
    # read the header and turn it into a value:index map
    header = {v: i for i, v in enumerate(f.readline().split())}
    for line in f:  # read the rest of the file line-by-line
        line = line.strip()  # first clear out the whitespace
        if line:  # check if there is any content left or is it an empty line
            elements = line.split()
            print(elements[header["Sequence"]])  # print the Sequence element
You can also use a header map to turn your rows into dict structures for even easier access.
UPDATE: Here's how to create a header map and then use it to build a dict out of your lines:
with open("your_data", "r") as f:
    # read the header and turn it into an index:value map
    header = {i: v for i, v in enumerate(f.readline().split())}
    for line in f:  # read the rest of the file line-by-line
        line = line.strip()  # first clear out the whitespace
        if line:  # check if there is any content left or is it an empty line
            # split the line, iterate over it and use the header map to create a dict
            row = {header[i]: v for i, v in enumerate(line.split())}
            print(row["Sequence"])  # ... or you can append it to a list for later use
As for how to 'delete' lines that you don't want for some reason, you'll have to create a temporary file, loop through your original file, compare your values, write the ones that you want to keep into the temporary file, delete the original file and finally rename the temporary file to match your original file, something like:
import shutil
from tempfile import NamedTemporaryFile

SOURCE_FILE = "your_data"  # path to the original file to process

def compare_func(seq):  # a simple comparison function for our sequence
    return not seq.endswith("TC")  # use Levenshtein distance or whatever you want instead

# open a temporary file for writing and our source file for reading
with NamedTemporaryFile(mode="w", delete=False) as t, open(SOURCE_FILE, "r") as f:
    header_line = f.readline()  # read the header
    t.write(header_line)  # write the header immediately to the temporary file
    header = {v: i for i, v in enumerate(header_line.split())}  # create a header map
    last_line = ""  # a var to store the whitespace to keep the same format
    for line in f:  # read the rest of the file line-by-line
        row = line.strip()  # first clear out the whitespace
        if row:  # check if there is any content left or is it an empty line
            elements = row.split()  # split the row into elements
            # now lets call our comparison function
            if compare_func(elements[header["Sequence"]]):  # keep the line if True
                t.write(last_line)  # write down the last whitespace to the temporary file
                t.write(line)  # write down the current line to the temporary file
        else:
            last_line = line  # store the whitespace for later use
shutil.move(t.name, SOURCE_FILE)  # finally, overwrite the source with the temporary file
This will produce the same file sans the second row from your example, since its sequence ends in TC and our compare_func() returns False in that case.
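Since the question asks for a Levenshtein-based comparison, here is a minimal sketch of what a replacement for compare_func() might look like. It uses the classic dynamic-programming edit distance in pure Python; REFERENCE and MAX_DISTANCE are hypothetical values chosen only for illustration, not something taken from the original report.

```python
def levenshtein(a, b):
    """Return the edit distance between strings a and b."""
    if len(a) < len(b):
        a, b = b, a  # iterate over the longer string, keep rows short
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


REFERENCE = "TGATCGCACGCCGAATGGAAACACGTTTT"  # hypothetical reference sequence
MAX_DISTANCE = 5                             # hypothetical threshold


def compare_func(seq):
    """Keep a row only if its sequence is close enough to REFERENCE."""
    return levenshtein(seq, REFERENCE) <= MAX_DISTANCE
```

For long sequences or many rows, a compiled third-party implementation would likely be faster; the pure-Python version is only meant to show the idea.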
For a bit less complexity, instead of using temporary files you can load your whole source file into the working memory and then just overwrite it, but that would work only for files that can fit your working memory while the above approach can work with files as large as your free storage space.
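A rough sketch of that in-memory variant could look like the following. The function name filter_rows_in_memory and the demo rows are made up for illustration, and, unlike the temporary-file version, this sketch simply keeps every blank line rather than dropping the one that precedes a removed row.

```python
import os
import tempfile


def filter_rows_in_memory(path, keep):
    """Rewrite path, keeping the header, blank lines and rows where keep(sequence) is true."""
    with open(path, "r") as f:
        lines = f.readlines()  # the whole file is held in memory
    if not lines:
        return
    column = lines[0].split().index("Sequence")  # position of the Sequence column
    kept = [lines[0]]
    for line in lines[1:]:
        fields = line.split()
        if not fields or keep(fields[column]):  # blank lines are always kept
            kept.append(line)
    with open(path, "w") as f:
        f.writelines(kept)


# Small demonstration on a throwaway file (made-up rows, not real dreg output)
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "report.txt")
    with open(demo_path, "w") as f:
        f.write("Start End Strand Pattern Sequence\n\n"
                "1 5 + regex:x TGATC\n\n"
                "7 11 + regex:x TGAAA\n")
    filter_rows_in_memory(demo_path, lambda seq: not seq.endswith("TC"))
    with open(demo_path) as f:
        remaining = f.read()
```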
Related
I have a .fastq file (cannot use Biopython) that consists of multiple samples in different lines. The file contents look like this:
#sample1
ACGTC.....
+
IIIIDDDDDFF
#sample2
AGCGC....
+
IIIIIDFDFD
.
.
.
#sampleX
ACATAG
+
IIIIIDDDFFF
I want to take the file and separate out each individual set of samples (i.e. lines 1-4, 5-8 and so on until the end of the file) and write each of them to a separate file (i.e. sample1.fastq contains the contents of sample 1, lines 1-4, and so on). Is this doable using loops in Python?
You can use defaultdict and regex for this
import re
from collections import defaultdict

# Get file contents
with open("test.fastq", "r") as f:
    content = f.read()

samples = defaultdict(list)  # Make defaultdict of empty lists
identifier = ""
# Iterate through every line in file
for line in content.split("\n"):
    # Find strings which start with #
    if re.match("^#.*", line):
        # Set identifier to match following lines to this section
        identifier = line.replace("#", "")
    else:
        # Add the line to its identifier
        samples[identifier].append(line)
Now all you have to do is save the contents of this default dictionary into multiple files:
# Loop through all samples (and their contents)
for sample_name, sample_items in samples.items():
    # Create new file with the name of its sample_name.fastq
    # (You might want to change the naming)
    with open(f"{sample_name}.fastq", "w") as f:
        # Write each element of the sample_items to new line
        f.write("\n".join(sample_items))
It might be helpful to also include #sample_name at the beginning of each file (as the first line), but I'm not sure you want that, so I haven't added it.
Note that you can tighten the regex to match only #sample[number] instead of every line starting with #. If you want that, use re.match(r"^#sample\d+", line) instead.
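Putting those two notes together, a possible variant keeps the header line in each group and only treats #sample[number] lines as headers. This is a sketch under the assumption that every record starts with such a header; group_samples and the example text are illustrative names and data, not part of the original answer.

```python
import re
from collections import defaultdict

HEADER = re.compile(r"^#(sample\d+)\s*$")  # only lines like "#sample12" start a record


def group_samples(text):
    """Map each sample name to its lines, header line included."""
    samples = defaultdict(list)
    current = None
    for line in text.splitlines():
        match = HEADER.match(line)
        if match:
            current = match.group(1)
            samples[current].append(line)  # keep the "#sampleN" header
        elif current is not None and line:
            samples[current].append(line)  # sequence, "+" or quality line
    return dict(samples)


example = "#sample1\nACGTC\n+\nIIIID\n#sample2\nAGCGC\n+\nIIIII\n"
groups = group_samples(example)
```

Each value in groups can then be written out with "\n".join(...) exactly as in the loop above.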
I am currently working on a project and I need to test if, on the last row (line) of the input, I have this byte: '\x1a'. If the last row has this marker I want to delete the entire row.
I have this code so far, but I don't know how to make it test for that byte on the last row and delete it.
Thank you!
readFile1 = open("sdn.csv")
lines1 = readFile1.readlines()
readFile1.close()
w1 = open("sdn.csv", 'w')
w1.writelines([item for item in lines1[:-1]])
w1.close()
readFile2 = open("add.csv")
lines2 = readFile2.readlines()
readFile2.close()
w2 = open("add.csv",'w')
w2.writelines([item for item in lines2[:-1]])
w2.close()
readFile3 = open("alt.csv")
lines3 = readFile3.readlines()
readFile3.close()
w = open("alt.csv",'w')
w.writelines([item for item in lines3[:-1]])
w.close()
In each of your code blocks, you read the file's contents into a variable with a line like:
lines1 = readFile1.readlines()
If you want to see if the \x1a byte exists anywhere in the last line of the text, then you can do this:
if '\x1a' in lines1[-1]:
    # whatever you need to do
    pass
If you want to find the byte and then actually delete the row from the list altogether:
if '\x1a' in lines1[-1]:
    # \x1a byte was found, remove the last item from list
    del lines1[-1]
And if I may offer a suggestion, all your code blocks repeat. You could create a function which captures all the functionality and then pass file names to it.
def process_csv(file_name):
    # Open the file for both reading and writing
    # This will also automatically close the file handle after
    # you're done with it
    with open(file_name, 'r+') as csv_file:
        data = csv_file.readlines()
        # "data and" avoids an IndexError when the file is empty
        if data and '\x1a' in data[-1]:
            # erase file and then write data without last row to it
            csv_file.seek(0)
            csv_file.truncate()
            csv_file.writelines(data[:-1])
        else:
            # Just making this explicit
            # Don't do anything to the file if the \x1a byte wasn't found
            pass

for f in ('sdn.csv', 'add.csv', 'alt.csv'):
    process_csv(f)
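If you want to check the decision logic without touching any files, one option is to pull it out into a small pure function. The name drop_trailing_eof_row is hypothetical; the sketch only operates on a list of lines such as the one returned by readlines().

```python
def drop_trailing_eof_row(lines, marker="\x1a"):
    """Return lines without the last row if that row contains marker."""
    if lines and marker in lines[-1]:  # an empty list is returned unchanged
        return lines[:-1]
    return lines


with_marker = drop_trailing_eof_row(["a,b\n", "c,d\n", "\x1a"])
without_marker = drop_trailing_eof_row(["a,b\n", "c,d\n"])
```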
Is there a way to do this? Say I have a file that's a list of names that goes like this:
Alfred
Bill
Donald
How could I insert the third name, "Charlie", at line x (in this case 3), and automatically send all others down one line? I've seen other questions like this, but they didn't get helpful answers. Can it be done, preferably with either a method or a loop?
This is a way of doing the trick.
with open("path_to_file", "r") as f:
    contents = f.readlines()
contents.insert(index, value)
with open("path_to_file", "w") as f:
    contents = "".join(contents)
    f.write(contents)
index and value are the line number and text of your choice; line numbers start from 0, and value should end with a newline.
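Wrapped into a reusable function, the same read-insert-rewrite idea might look like this. It is only a sketch: insert_line is an illustrative name, and the function appends a missing trailing newline to both the new text and the last existing line, which is an assumption about what you want.

```python
import os
import tempfile


def insert_line(path, index, text):
    """Insert text as a new line at the 0-based position index of the file at path."""
    if not text.endswith("\n"):
        text += "\n"
    with open(path, "r") as f:
        contents = f.readlines()
    if contents and not contents[-1].endswith("\n"):
        contents[-1] += "\n"  # so an appended line does not merge with the last one
    contents.insert(index, text)
    with open(path, "w") as f:
        f.writelines(contents)


# Demonstration on a throwaway file
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "names.txt")
    with open(demo_path, "w") as f:
        f.write("Alfred\nBill\nDonald\n")
    insert_line(demo_path, 2, "Charlie")  # index 2 is the third line
    with open(demo_path) as f:
        result = f.read()
```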
If you want to search a file for a substring and add a new text to the next line, one of the elegant ways to do it is the following:
import fileinput

old = "A"
new = "B"
for line in fileinput.FileInput(file_path, inplace=True):
    if old in line:
        line += new + "\n"  # "\n" is translated to the platform line ending on write
    print(line, end="")
There is a combination of techniques which I found useful in solving this issue:
with open(file, 'r+') as fd:
    contents = fd.readlines()
    contents.insert(index, new_string)  # new_string should end in a newline
    fd.seek(0)  # readlines consumes the iterator, so we need to start over
    fd.writelines(contents)  # No need to truncate as we are increasing filesize
In our particular application, we wanted to add it after a certain string:
with open(file, 'r+') as fd:
    contents = fd.readlines()
    if match_string in contents[-1]:  # Handle last line to prevent IndexError
        contents.append(insert_string)
    else:
        for index, line in enumerate(contents):
            if match_string in line and insert_string not in contents[index + 1]:
                contents.insert(index + 1, insert_string)
                break
    fd.seek(0)
    fd.writelines(contents)
If you want it to insert the string after every instance of the match, instead of just the first, remove the else: (and properly unindent) and the break.
Note also that the "and insert_string not in contents[index + 1]" check prevents it from adding more than one copy after the match_string, so it's safe to run repeatedly.
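As a rough illustration of the "every instance" variant, here is a sketch that works on a plain list of lines, so it can be tried without a file. The function name and sample data are hypothetical, and it assumes insert_string does not itself contain match_string.

```python
def insert_after_matches(lines, match_string, insert_string):
    """Return a new list with insert_string added after every matching line."""
    result = []
    for index, line in enumerate(lines):
        result.append(line)
        if match_string in line:
            # Look at the following line (if any) to avoid inserting twice
            next_line = lines[index + 1] if index + 1 < len(lines) else ""
            if insert_string not in next_line:
                result.append(insert_string)
    return result


lines = ["[a]\n", "x\n", "[b]\n"]
once = insert_after_matches(lines, "[", "new\n")
twice = insert_after_matches(once, "[", "new\n")  # a second run changes nothing
```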
You can just read the data into a list and insert the new record where you want.
names = []
with open('names.txt', 'r+') as fd:
    for line in fd:
        names.append(line.split(' ')[-1].strip())
    names.insert(2, "Charlie")  # element 2 will be 3. in your list
    fd.seek(0)
    fd.truncate()
    for i in range(len(names)):  # range, since xrange only exists in Python 2
        fd.write("%d. %s\n" % (i + 1, names[i]))
The accepted answer has to load the whole file into memory, which doesn't work nicely for large files. The following solution writes the file contents with the new data inserted into the right line to a temporary file in the same directory (so on the same file system), only reading small chunks from the source file at a time. It then overwrites the source file with the contents of the temporary file in an efficient way (Python 3.8+).
from pathlib import Path
from shutil import copyfile
from tempfile import NamedTemporaryFile

sourcefile = Path("/path/to/source").resolve()
insert_lineno = 152  # The line to insert the new data into.
insert_data = "..."  # Some string to insert.
with sourcefile.open(mode="r") as source:
    destination = NamedTemporaryFile(mode="w", dir=str(sourcefile.parent))
    lineno = 1
    while lineno < insert_lineno:
        destination.file.write(source.readline())
        lineno += 1
    # Insert the new data.
    destination.file.write(insert_data)
    # Write the rest in chunks.
    while True:
        data = source.read(1024)
        if not data:
            break
        destination.file.write(data)
# Finish writing data.
destination.flush()
# Overwrite the original file's contents with that of the temporary file.
# This uses a memory-optimised copy operation starting from Python 3.8.
copyfile(destination.name, str(sourcefile))
# Delete the temporary file.
destination.close()
EDIT 2020-09-08: I just found an answer on Code Review that does something similar to above with more explanation - it might be useful to some.
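For comparison, here is a sketch of the same streaming idea that swaps copyfile for os.replace, so the temporary file is renamed over the source instead of being copied back. This is a different technique from the answer above, not a correction of it; insert_line_streaming is an illustrative name, and note that the replaced file ends up with the temporary file's permissions.

```python
import os
import shutil
import tempfile


def insert_line_streaming(path, lineno, text):
    """Insert text before the 1-based line lineno without loading the whole file."""
    directory = os.path.dirname(os.path.abspath(path))
    with open(path, "r") as source, tempfile.NamedTemporaryFile(
            mode="w", dir=directory, delete=False) as destination:
        for _ in range(lineno - 1):
            destination.write(source.readline())  # copy the lines before the insert
        destination.write(text)                   # text should end with a newline
        shutil.copyfileobj(source, destination)   # copy the rest in chunks
    # Both files are closed here; rename the temporary file over the source
    os.replace(destination.name, path)


# Demonstration on a throwaway file
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "letters.txt")
    with open(demo_path, "w") as f:
        f.write("a\nb\nd\n")
    insert_line_streaming(demo_path, 3, "c\n")
    with open(demo_path) as f:
        streamed = f.read()
```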
You don't show us what the output should look like, so one possible interpretation is that you want this as the output:
Alfred
Bill
Charlie
Donald
(Insert Charlie, then add 1 to all subsequent lines.) Here's one possible solution:
def insert_line(input_stream, pos, new_name, output_stream):
    inserted = False
    for line in input_stream:
        number, name = parse_line(line)
        if number == pos:
            # Python 3 print() with file= replaces the Python 2 "print >>" form
            print(format_line(number, new_name), file=output_stream)
            inserted = True
        print(format_line(number if not inserted else (number + 1), name), file=output_stream)

def parse_line(line):
    number_str, name = line.strip().split()
    return (get_number(number_str), name)

def get_number(number_str):
    return int(number_str.split('.')[0])

def format_line(number, name):
    return add_dot(number) + ' ' + name

def add_dot(number):
    return str(number) + '.'

input_stream = open('input.txt', 'r')
output_stream = open('output.txt', 'w')
insert_line(input_stream, 3, 'Charlie', output_stream)
input_stream.close()
output_stream.close()
Parse the file into a python list using file.readlines() or file.read().split('\n')
Identify the position where you have to insert a new line, according to your criteria.
Insert a new list element there using list.insert().
Write the result to the file.
location_of_line = 0
with open(filename, 'r') as file_you_want_to_read:
    # read the lines in the file and put them in a list
    contents = file_you_want_to_read.readlines()

# find the index of the line you are looking for
for index, line in enumerate(contents):
    if line.startswith('whatever you are looking for'):
        location_of_line = index

# now you have a list of every line in that file;
# list.insert() places the new line before location_of_line
contents.insert(location_of_line, "whatever you want to append to middle of file\n")
with open(filename, 'w') as file_to_write_to:
    file_to_write_to.writelines(contents)
That is how I ended up inserting whatever data I want into the middle of the file.
This is only a rough sketch, as I was having a hard time finding a clear explanation of what is going on.
Essentially, you read the entire file into a list, insert the lines you want into that list, and then rewrite the same file.
I am sure there are better ways to do this, and it may not be efficient, but it makes more sense to me at least; I hope it makes sense to someone else.
A simple but not efficient way is to read the whole content, change it and then rewrite it:
line_index = 2  # 0-based, so index 2 is the third line
lines = None
with open('file.txt', 'r') as file_handler:
    lines = file_handler.readlines()
lines.insert(line_index, 'Charlie\n')  # the trailing newline keeps the next name on its own line
with open('file.txt', 'w') as file_handler:
    file_handler.writelines(lines)
I wrote this to reuse and correct martincho's answer (the accepted one).
! IMPORTANT: This code loads the whole file into RAM and rewrites its content to the file.
The variables index and value can be whatever you want, but make sure value is a string that ends with '\n' if you don't want it to run into the existing data.
with open("path_to_file", "r+") as f:
    # Read the content into a variable
    contents = f.readlines()
    contents.insert(index, value)
    # Reset the reader's location (in bytes)
    f.seek(0)
    # Rewrite the content to the file
    f.writelines(contents)
See the Python docs on the file.seek method.
Below is a slightly awkward solution for the special case in which you are creating the original file yourself and happen to know the insertion location (e.g. you know ahead of time that you will need to insert a line with an additional name before the third line, but won't know the name until after you've fetched and written the rest of the names). Reading, storing and then re-writing the entire contents of the file as described in other answers is, I think, more elegant than this option, but may be undesirable for large files.
You can leave a buffer of invisible null characters ('\0') at the insertion location to be overwritten later:
num_names = 1_000_000  # Enough data to make storing in a list unideal
max_len = 20  # The maximum allowed length of the inserted line
line_to_insert = 2  # The third line is at index 2 (0-based indexing)
with open(filename, 'w+') as file:
    for i in range(line_to_insert):
        name = get_name(i)  # Returns 'Alfred' for i = 0, etc.
        file.write(F'{i + 1}. {name}\n')
    insert_position = file.tell()  # Position to jump back to for insertion
    file.write('\0' * max_len + '\n')  # Buffer will show up as a blank line
    for i in range(line_to_insert, num_names):
        name = get_name(i)
        file.write(F'{i + 2}. {name}\n')  # Line numbering now bumped up by 1.

# Later, once you have the name to insert...
with open(filename, 'r+') as file:  # Must use 'r+' to write to middle of file
    file.seek(insert_position)  # Move stream to the insertion line
    name = get_bonus_name()  # This lucky winner jumps up to 3rd place
    new_line = F'{line_to_insert + 1}. {name}'
    file.write(new_line[:max_len])  # Slice so you don't overwrite next line
Unfortunately there is no way to delete-without-replacement any excess null characters that did not get overwritten (or in general any characters anywhere in the middle of a file), unless you then re-write everything that follows. In many viewers the null characters will not affect how your file looks to a human (they are rendered with zero width), although some editors display them as control characters such as ^@.
I have a file containing filenames. I want to delete some lines and a substring of each filename using Python code. My file format is as follows:
img/1.jpg
img/10.jpg
img/100.jpg 0 143 84 227
...
I want to delete the img/ substring throughout the file, as well as the lines where the coordinates are missing. For the second task I did the following:
for con in content:
    if ".jpg\n" in con:
        content.remove(con)
for con in content:
    print con
However content didn't change.
You're attempting to modify the list content while iterating over it. This will very quickly bite you in the knees.
Instead, in Python you generate a new list:
>>> content = [fn for fn in content if not fn.endswith(".jpg\n")]
After this you can overwrite the file you read from with the new content. The above example assumes there is no whitespace to accommodate between the filename and the newline.
The error in your current method is because you are iterating through each line by letter, for l in somestring: will go letter by letter. Obviously, a ".jpg\n" won't be in a single letter, so you never hit content.remove(con).
I would suggest a slightly different approach:
with open("fileofdata.txt", 'r') as f:
    content = [line for line in f.readlines() if len(line.split()) > 1]
Using len(line.split()) is more robust than line.endswith() because it allows for whitespace between .jpg and \n.
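The first task from the question, removing the img/ prefix, can be handled in the same pass. The following is a sketch only: clean_lines is an illustrative name, and it normalises the separators between fields to single spaces, which may or may not be what you want.

```python
def clean_lines(lines, prefix="img/"):
    """Drop lines without coordinates and strip prefix from the filename."""
    cleaned = []
    for line in lines:
        fields = line.split()
        if len(fields) <= 1:
            continue  # only a filename (or nothing): coordinates are missing
        if fields[0].startswith(prefix):
            fields[0] = fields[0][len(prefix):]  # remove the leading "img/"
        cleaned.append(" ".join(fields) + "\n")
    return cleaned


sample = ["img/1.jpg\n", "img/10.jpg\n", "img/100.jpg 0 143 84 227\n"]
cleaned = clean_lines(sample)
```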