What am I missing here? Basic file i/o & string.find() - python

I've got this text file which lists all my movies and I thought I'd put them into a database of sorts. So first steps, read the text file, do some small manipulation and rewrite the file.
So, some lines contain multiple movie names separated by "AKA" so I need to turn that into two separate lines before I write it to the new file.
I have now struck to problems:
Losing the first line in the output file (which I solved by using an additional read/write outside the main loop omitting the check for "AKA" at this stage.
The if statement checking the result of the string.find("AKA") is never triggered.
The code is here:
fd1 = open("Movies_List_2.txt", "r")
fd2 = open("Movies_List_3.txt", "w")
inp_line = fd1.read() # This is to fix the missing first line in the output file
fd2.write(inp_line.strip() + "\n")
for line in fd1:
inp_line = fd1.read()
x = inp_line.find("ÄKA")
if x == -1: # <=== This never triggers
l = len(inp_line)
extn = inp_line[l-3:]
year = inp_line[l-8:l-4]
inp_line2 = inp_line[x+4:l-9]
inp_line = inp_line[:x-1]
fd2.write(inp_line + " " + year + "." + extn + "\n")
fd2.write(inp_line + " " + year + "." + extn + "]n")
else:
fd2.write(inp_line)
fd1.close()
fd2.close()
And a sample of the input file is here:
20 Million Miles to Earth 1957
20,000 Leagues Under the Sea 1954
2001 A Space Odyssey 1968
2010 The Year We Make Contact 1981
2017 AKA Shockwave 2017 <====== This should trigger the test
2036 Origin Unknown 2018
2046 2004
2050 2018
2067 2020
I'm almost certain that there is a fundamental I'm missing here, but I've spent a lot of time over it with no success.
Can someone point out where this code is going wrong?

This
fd1 = open("Movies_List_2.txt", "r")
...
for line in fd1:
inp_line = fd1.read()
...
fd1.close()
looks like you have mixing 2 different ways to read text file in python. These are:
reading everything as once, for example if I want to know number of e letters I can do:
f = open(filename, "r")
content = f.read()
f.close()
print(content.count("e"))
processing file line-by-line, for example if I want to know number of e letters in each line I can do:
f = open(filename, "r")
for line in f:
print(line.count("e"))
f.close()
Note that open might be also used as context manager, which simply speaking does handle closing for you, following is my 1st example using that feature:
with open(filename, "r") as f:
content = f.read()
print(content.count("e"))

Setting aside other dubious code in your script ("ÄKA" and "AKA" are not the same string), your test never triggers because you never loop inside the for.
inp_line = fd1.read() will read the whole file inside inp_line, not just one line! This causes the file descriptor to reach the end of file, so when you try to get the "next" line in:
for line in fd1:
the StopIteration triggers and you never actually iterate.
The solution is to just iterate line-by-line in the for loop:
fd1 = open("Movies_List_2.txt", "r")
fd2 = open("Movies_List_3.txt", "w")
for line in fd1:
x = line.find("ÄKA")
if x == -1: # <=== This never triggers
...
(Note that you need to restructure your loop to use line instead of inp_line.)

I think this is the solution you need.just a little fix.
fd1 = open("Movies_List_2.txt", "r")
fd2 = open("Movies_List_3.txt", "w")
for line in fd1:
line = line.strip()
if line.find("AKA")>0:
print(line.split('AKA'))
line_a,line_b = line.split('AKA')
fd2.write(line_a+"\n")
fd2.write(line_b.strip()+"\n")
else:
fd2.write(line+"\n")
fd1.close()
fd2.close()

How I would do it:
# Forget to close the files.
with open("Movies_List_2.txt") as fdIn:
with open("Movies_List_3.txt", "w") as fdOut:
# The thing...
for line in fdIn: # Read file line by line
line = line.strip()
# Single line movies. Just copy
if 'AKA' not in line:
print(line, file=fdOut)
continue
# Lines with multiple movies
head, year = line[:-5], line[-4:]
for movie in head.split('AKA'): # Split movies
movie = movie.strip()
print(f'{movie} {year}', file=fdOut) # Write move adding the year

Related

python3 split big file by delimiter into small files (not size, lines)

Newbie here. Ultimate mission is to learn how to take two big yaml files and split them into several hundred small files. I haven't yet figured out how to use the ID # as the filename, so one thing at a time.
First: split the big files into many. Here's a tiny bit of my test data file test-file.yml. Each post has a - delimiter on a line by itself:
-
ID: 627
more_post_meta_data_and_content
-
ID: 628
And here's my code that isn't working. So far I don't see why:
with open('test-file.yml', 'r') as myfile:
start = 0
cntr = 1
holding = ''
for i in myfile.read().split('\n'):
if (i == '-\n'):
if start==1:
with open(str(cntr) + '.md','w') as opfile:
opfile.write(op)
opfile.close()
holding=''
cntr += 1
else:
start=1
else:
if holding =='':
holding = i
else:
holding = holding + '\n' + i
myfile.close()
All hints, suggestions, pointers welcome. Thanks.
Reading the entire file into memory and then splitting the memory regions is very inefficient if the input files are large. Try this instead:
with open('test-file.yml', 'r') as myfile:
opfile = None
cntr = 1
for line in myfile:
if line == '-\n':
if opfile is not None:
opfile.close()
opfile = open('{0}.md'.format(cntr),'w')
cntr += 1
opfile.write(line)
opfile.close()
Notice also, you don't close things you have opened in a with context manager; the very purpose of the context manager is to take care of this for you.
As a newbie myself, at first glance your trying to write an undeclared variable op to your output. You were nearly spot on, just need to iterate through your opfile and write the contents:
with open('test-file.yml', 'r') as myfile:
start = 0
cntr = 1
holding = ''
for i in myfile.read().split('\n'):
if (i == '-\n'):
if start==1:
with open(str(cntr) + '.md','w') as opfile:
for line in opfile:
op = line
opfile.write(op)
opfile.close()
holding=''
cntr += 1
else:
start=1
else:
if holding =='':
holding = i
else:
holding = holding + '\n' + i
myfile.close()
Hope this helps!
When you are working in a with context on an open file, the with will automatically take care of closing it for you when you exit this block. So you don't need file.close() anywhere.
There is a function called readlines that outputs a generator that reads a line from an open file one line at a time. That will work much more efficiently than read() followed by a split(). Think about it. You are loading a massive file in memory, and then asking CPU to split that ginormous text by \n character. Not very efficient.
You wrote opfile.write(op). Where is this op defined? Don't you want to write the content in holding that you've defined?
Try the following.
with open('test.data', 'r') as myfile:
counter = 1
content = ""
start = True
for line in myfile.readlines():
if line == "-\n" and not start:
with open(str(counter) + '.md', 'w') as opfile:
opfile.write(content)
content = ""
counter += 1
else:
if not start:
content += line
start = False
# write the last file if test-file.yml doesn't end with a dash
if content != "":
with open(str(counter) + '.md', 'w') as opfile:
opfile.write(content)

Find unique entries in files

guess you have a solution concerning the following issue:
I want to compare two lists for common entries (on the basis of column 10) and write common entries to one file and unique entries for the first list into another file. The code I wrote is:
INFILE1 = open ("c:\\python\\test\\58962.filtered.csv", "r")
INFILE2 = open ("c:\\python\\test\\83887.filtered.csv", "r")
OUTFILE1 = open ("c:\\python\\test\\58962_vs_83887.common.csv", "w")
OUTFILE2 = open ("c:\\python\\test\\58962_vs_83887.unique.csv", "w")
for line in INFILE1:
line = line.rstrip().split(",")
if line[11] in INFILE2:
OUTFILE1.write(line)
else:
OUTFILE2.write(line)
INFILE1.close()
INFILE2.close()
OUTFILE1.close()
OUTFILE2.close()
The following error appears:
8 OUTFILE1.write(line)
9 else:
---> 10 OUTFILE2.write(line)
11 INFILE1.close()
TypeError: write() argument must be str, not list
Does somebody know about help for this?
Best
This line
line = line.rstrip().split(",")
replaces the line you read from a file by it's splitted list. You then try to write the splitted list to your file - thats not how the write method works and it tells you exactly that.
Change it to :
for line in INFILE1:
lineList = line.rstrip().split(",") # dont overwrite line, use lineList
if lineList[11] in INFILE2: # used lineList
OUTFILE1.write(line) # corrected indentation
else:
OUTFILE2.write(line)
You could have easily found your error yourself, just printing out the line before and after splitting or just befrore writing.
Please read How to debug small programs (#1) and follow it - its easier to find and fix bugs yourself then posting questions here.
You have some other problem at hand, though:
Files are stream based, they start with a position of 0 in the file. The position is advanced if you access parts of the file. When at the end, you wont get anything by using INFILE2.read() or other methods.
So if you want to repeatadly check if some lines column of file1 is somewhere in file2 you need to read file2 into a list (or other datastructure) so your repeated checks work. In other words, this:
if lineList[11] in INFILE2:
might work once, then the file is consumed and it will return false all the time.
You also might want to change from:
f = open(...., ...)
# do something with f
f.close()
to
with open(name,"r") as f:
# do something with f, no close needed, closed when leaving block
as it is safer, will close the file even if exceptions happen.
To solve that try this (untested) code:
with open ("c:\\python\\test\\83887.filtered.csv", "r") as file2:
infile2 = file2.readlines() # read in all lines as list
with open ("c:\\python\\test\\58962.filtered.csv", "r") as INFILE1:
# next 2 lines are 1 line, \ at end signifies line continues
with open ("c:\\python\\test\\58962_vs_83887.common.csv", "w") as OUTFILE1, \
with open ("c:\\python\\test\\58962_vs_83887.unique.csv", "w") as OUTFILE2:
for line in INFILE1:
lineList = line.rstrip().split(",")
if any(lineList[11] in x for x in infile2): # check the list of lines if
# any contains line[11]
OUTFILE1.write(line)
else:
OUTFILE2.write(line)
# all files are autoclosed here
Links to read:
the-with-statement
any() and other built-ins

Python3 - dumping a JSON data into penultimate line of a file [duplicate]

Is there a way to do this? Say I have a file that's a list of names that goes like this:
Alfred
Bill
Donald
How could I insert the third name, "Charlie", at line x (in this case 3), and automatically send all others down one line? I've seen other questions like this, but they didn't get helpful answers. Can it be done, preferably with either a method or a loop?
This is a way of doing the trick.
with open("path_to_file", "r") as f:
contents = f.readlines()
contents.insert(index, value)
with open("path_to_file", "w") as f:
contents = "".join(contents)
f.write(contents)
index and value are the line and value of your choice, lines starting from 0.
If you want to search a file for a substring and add a new text to the next line, one of the elegant ways to do it is the following:
import os, fileinput
old = "A"
new = "B"
for line in fileinput.FileInput(file_path, inplace=True):
if old in line :
line += new + os.linesep
print(line, end="")
There is a combination of techniques which I found useful in solving this issue:
with open(file, 'r+') as fd:
contents = fd.readlines()
contents.insert(index, new_string) # new_string should end in a newline
fd.seek(0) # readlines consumes the iterator, so we need to start over
fd.writelines(contents) # No need to truncate as we are increasing filesize
In our particular application, we wanted to add it after a certain string:
with open(file, 'r+') as fd:
contents = fd.readlines()
if match_string in contents[-1]: # Handle last line to prevent IndexError
contents.append(insert_string)
else:
for index, line in enumerate(contents):
if match_string in line and insert_string not in contents[index + 1]:
contents.insert(index + 1, insert_string)
break
fd.seek(0)
fd.writelines(contents)
If you want it to insert the string after every instance of the match, instead of just the first, remove the else: (and properly unindent) and the break.
Note also that the and insert_string not in contents[index + 1]: prevents it from adding more than one copy after the match_string, so it's safe to run repeatedly.
You can just read the data into a list and insert the new record where you want.
names = []
with open('names.txt', 'r+') as fd:
for line in fd:
names.append(line.split(' ')[-1].strip())
names.insert(2, "Charlie") # element 2 will be 3. in your list
fd.seek(0)
fd.truncate()
for i in xrange(len(names)):
fd.write("%d. %s\n" %(i + 1, names[i]))
The accepted answer has to load the whole file into memory, which doesn't work nicely for large files. The following solution writes the file contents with the new data inserted into the right line to a temporary file in the same directory (so on the same file system), only reading small chunks from the source file at a time. It then overwrites the source file with the contents of the temporary file in an efficient way (Python 3.8+).
from pathlib import Path
from shutil import copyfile
from tempfile import NamedTemporaryFile
sourcefile = Path("/path/to/source").resolve()
insert_lineno = 152 # The line to insert the new data into.
insert_data = "..." # Some string to insert.
with sourcefile.open(mode="r") as source:
destination = NamedTemporaryFile(mode="w", dir=str(sourcefile.parent))
lineno = 1
while lineno < insert_lineno:
destination.file.write(source.readline())
lineno += 1
# Insert the new data.
destination.file.write(insert_data)
# Write the rest in chunks.
while True:
data = source.read(1024)
if not data:
break
destination.file.write(data)
# Finish writing data.
destination.flush()
# Overwrite the original file's contents with that of the temporary file.
# This uses a memory-optimised copy operation starting from Python 3.8.
copyfile(destination.name, str(sourcefile))
# Delete the temporary file.
destination.close()
EDIT 2020-09-08: I just found an answer on Code Review that does something similar to above with more explanation - it might be useful to some.
You don't show us what the output should look like, so one possible interpretation is that you want this as the output:
Alfred
Bill
Charlie
Donald
(Insert Charlie, then add 1 to all subsequent lines.) Here's one possible solution:
def insert_line(input_stream, pos, new_name, output_stream):
inserted = False
for line in input_stream:
number, name = parse_line(line)
if number == pos:
print >> output_stream, format_line(number, new_name)
inserted = True
print >> output_stream, format_line(number if not inserted else (number + 1), name)
def parse_line(line):
number_str, name = line.strip().split()
return (get_number(number_str), name)
def get_number(number_str):
return int(number_str.split('.')[0])
def format_line(number, name):
return add_dot(number) + ' ' + name
def add_dot(number):
return str(number) + '.'
input_stream = open('input.txt', 'r')
output_stream = open('output.txt', 'w')
insert_line(input_stream, 3, 'Charlie', output_stream)
input_stream.close()
output_stream.close()
Parse the file into a python list using file.readlines() or file.read().split('\n')
Identify the position where you have to insert a new line, according to your criteria.
Insert a new list element there using list.insert().
Write the result to the file.
location_of_line = 0
with open(filename, 'r') as file_you_want_to_read:
#readlines in file and put in a list
contents = file_you_want_to_read.readlines()
#find location of what line you want to insert after
for index, line in enumerate(contents):
if line.startswith('whatever you are looking for')
location_of_line = index
#now you have a list of every line in that file
context.insert(location_of_line, "whatever you want to append to middle of file")
with open(filename, 'w') as file_to_write_to:
file_to_write_to.writelines(contents)
That is how I ended up getting whatever data I want to insert to the middle of the file.
this is just pseudo code, as I was having a hard time finding clear understanding of what is going on.
essentially you read in the file to its entirety and add it into a list, then you insert your lines that you want to that list, and then re-write to the same file.
i am sure there are better ways to do this, may not be efficient, but it makes more sense to me at least, I hope it makes sense to someone else.
A simple but not efficient way is to read the whole content, change it and then rewrite it:
line_index = 3
lines = None
with open('file.txt', 'r') as file_handler:
lines = file_handler.readlines()
lines.insert(line_index, 'Charlie')
with open('file.txt', 'w') as file_handler:
file_handler.writelines(lines)
I write this in order to reutilize/correct martincho's answer (accepted one)
! IMPORTANT: This code loads all the file into ram and rewrites content to the file
Variables index, value may be what you desire, but pay attention to making value string and end with '\n' if you don't want it to mess with existing data.
with open("path_to_file", "r+") as f:
# Read the content into a variable
contents = f.readlines()
contents.insert(index, value)
# Reset the reader's location (in bytes)
f.seek(0)
# Rewrite the content to the file
f.writelines(contents)
See the python docs about file.seek method: Python docs
Below is a slightly awkward solution for the special case in which you are creating the original file yourself and happen to know the insertion location (e.g. you know ahead of time that you will need to insert a line with an additional name before the third line, but won't know the name until after you've fetched and written the rest of the names). Reading, storing and then re-writing the entire contents of the file as described in other answers is, I think, more elegant than this option, but may be undesirable for large files.
You can leave a buffer of invisible null characters ('\0') at the insertion location to be overwritten later:
num_names = 1_000_000 # Enough data to make storing in a list unideal
max_len = 20 # The maximum allowed length of the inserted line
line_to_insert = 2 # The third line is at index 2 (0-based indexing)
with open(filename, 'w+') as file:
for i in range(line_to_insert):
name = get_name(i) # Returns 'Alfred' for i = 0, etc.
file.write(F'{i + 1}. {name}\n')
insert_position = file.tell() # Position to jump back to for insertion
file.write('\0' * max_len + '\n') # Buffer will show up as a blank line
for i in range(line_to_insert, num_names):
name = get_name(i)
file.write(F'{i + 2}. {name}\n') # Line numbering now bumped up by 1.
# Later, once you have the name to insert...
with open(filename, 'r+') as file: # Must use 'r+' to write to middle of file
file.seek(insert_position) # Move stream to the insertion line
name = get_bonus_name() # This lucky winner jumps up to 3rd place
new_line = F'{line_to_insert + 1}. {name}'
file.write(new_line[:max_len]) # Slice so you don't overwrite next line
Unfortunately there is no way to delete-without-replacement any excess null characters that did not get overwritten (or in general any characters anywhere in the middle of a file), unless you then re-write everything that follows. But the null characters will not affect how your file looks to a human (they have zero width).

Insert line at middle of file with Python?

Is there a way to do this? Say I have a file that's a list of names that goes like this:
Alfred
Bill
Donald
How could I insert the third name, "Charlie", at line x (in this case 3), and automatically send all others down one line? I've seen other questions like this, but they didn't get helpful answers. Can it be done, preferably with either a method or a loop?
This is a way of doing the trick.
with open("path_to_file", "r") as f:
contents = f.readlines()
contents.insert(index, value)
with open("path_to_file", "w") as f:
contents = "".join(contents)
f.write(contents)
index and value are the line and value of your choice, lines starting from 0.
If you want to search a file for a substring and add a new text to the next line, one of the elegant ways to do it is the following:
import os, fileinput
old = "A"
new = "B"
for line in fileinput.FileInput(file_path, inplace=True):
if old in line :
line += new + os.linesep
print(line, end="")
There is a combination of techniques which I found useful in solving this issue:
with open(file, 'r+') as fd:
contents = fd.readlines()
contents.insert(index, new_string) # new_string should end in a newline
fd.seek(0) # readlines consumes the iterator, so we need to start over
fd.writelines(contents) # No need to truncate as we are increasing filesize
In our particular application, we wanted to add it after a certain string:
with open(file, 'r+') as fd:
contents = fd.readlines()
if match_string in contents[-1]: # Handle last line to prevent IndexError
contents.append(insert_string)
else:
for index, line in enumerate(contents):
if match_string in line and insert_string not in contents[index + 1]:
contents.insert(index + 1, insert_string)
break
fd.seek(0)
fd.writelines(contents)
If you want it to insert the string after every instance of the match, instead of just the first, remove the else: (and properly unindent) and the break.
Note also that the and insert_string not in contents[index + 1]: prevents it from adding more than one copy after the match_string, so it's safe to run repeatedly.
You can just read the data into a list and insert the new record where you want.
names = []
with open('names.txt', 'r+') as fd:
for line in fd:
names.append(line.split(' ')[-1].strip())
names.insert(2, "Charlie") # element 2 will be 3. in your list
fd.seek(0)
fd.truncate()
for i in xrange(len(names)):
fd.write("%d. %s\n" %(i + 1, names[i]))
The accepted answer has to load the whole file into memory, which doesn't work nicely for large files. The following solution writes the file contents with the new data inserted into the right line to a temporary file in the same directory (so on the same file system), only reading small chunks from the source file at a time. It then overwrites the source file with the contents of the temporary file in an efficient way (Python 3.8+).
from pathlib import Path
from shutil import copyfile
from tempfile import NamedTemporaryFile
sourcefile = Path("/path/to/source").resolve()
insert_lineno = 152 # The line to insert the new data into.
insert_data = "..." # Some string to insert.
with sourcefile.open(mode="r") as source:
destination = NamedTemporaryFile(mode="w", dir=str(sourcefile.parent))
lineno = 1
while lineno < insert_lineno:
destination.file.write(source.readline())
lineno += 1
# Insert the new data.
destination.file.write(insert_data)
# Write the rest in chunks.
while True:
data = source.read(1024)
if not data:
break
destination.file.write(data)
# Finish writing data.
destination.flush()
# Overwrite the original file's contents with that of the temporary file.
# This uses a memory-optimised copy operation starting from Python 3.8.
copyfile(destination.name, str(sourcefile))
# Delete the temporary file.
destination.close()
EDIT 2020-09-08: I just found an answer on Code Review that does something similar to above with more explanation - it might be useful to some.
You don't show us what the output should look like, so one possible interpretation is that you want this as the output:
Alfred
Bill
Charlie
Donald
(Insert Charlie, then add 1 to all subsequent lines.) Here's one possible solution:
def insert_line(input_stream, pos, new_name, output_stream):
inserted = False
for line in input_stream:
number, name = parse_line(line)
if number == pos:
print >> output_stream, format_line(number, new_name)
inserted = True
print >> output_stream, format_line(number if not inserted else (number + 1), name)
def parse_line(line):
number_str, name = line.strip().split()
return (get_number(number_str), name)
def get_number(number_str):
return int(number_str.split('.')[0])
def format_line(number, name):
return add_dot(number) + ' ' + name
def add_dot(number):
return str(number) + '.'
input_stream = open('input.txt', 'r')
output_stream = open('output.txt', 'w')
insert_line(input_stream, 3, 'Charlie', output_stream)
input_stream.close()
output_stream.close()
Parse the file into a python list using file.readlines() or file.read().split('\n')
Identify the position where you have to insert a new line, according to your criteria.
Insert a new list element there using list.insert().
Write the result to the file.
location_of_line = 0
with open(filename, 'r') as file_you_want_to_read:
#readlines in file and put in a list
contents = file_you_want_to_read.readlines()
#find location of what line you want to insert after
for index, line in enumerate(contents):
if line.startswith('whatever you are looking for')
location_of_line = index
#now you have a list of every line in that file
context.insert(location_of_line, "whatever you want to append to middle of file")
with open(filename, 'w') as file_to_write_to:
file_to_write_to.writelines(contents)
That is how I ended up getting whatever data I want to insert to the middle of the file.
this is just pseudo code, as I was having a hard time finding clear understanding of what is going on.
essentially you read in the file to its entirety and add it into a list, then you insert your lines that you want to that list, and then re-write to the same file.
i am sure there are better ways to do this, may not be efficient, but it makes more sense to me at least, I hope it makes sense to someone else.
A simple but not efficient way is to read the whole content, change it and then rewrite it:
line_index = 3
lines = None
with open('file.txt', 'r') as file_handler:
lines = file_handler.readlines()
lines.insert(line_index, 'Charlie')
with open('file.txt', 'w') as file_handler:
file_handler.writelines(lines)
I write this in order to reutilize/correct martincho's answer (accepted one)
! IMPORTANT: This code loads all the file into ram and rewrites content to the file
Variables index, value may be what you desire, but pay attention to making value string and end with '\n' if you don't want it to mess with existing data.
with open("path_to_file", "r+") as f:
# Read the content into a variable
contents = f.readlines()
contents.insert(index, value)
# Reset the reader's location (in bytes)
f.seek(0)
# Rewrite the content to the file
f.writelines(contents)
See the python docs about file.seek method: Python docs
Below is a slightly awkward solution for the special case in which you are creating the original file yourself and happen to know the insertion location (e.g. you know ahead of time that you will need to insert a line with an additional name before the third line, but won't know the name until after you've fetched and written the rest of the names). Reading, storing and then re-writing the entire contents of the file as described in other answers is, I think, more elegant than this option, but may be undesirable for large files.
You can leave a buffer of invisible null characters ('\0') at the insertion location to be overwritten later:
num_names = 1_000_000 # Enough data to make storing in a list unideal
max_len = 20 # The maximum allowed length of the inserted line
line_to_insert = 2 # The third line is at index 2 (0-based indexing)
with open(filename, 'w+') as file:
for i in range(line_to_insert):
name = get_name(i) # Returns 'Alfred' for i = 0, etc.
file.write(F'{i + 1}. {name}\n')
insert_position = file.tell() # Position to jump back to for insertion
file.write('\0' * max_len + '\n') # Buffer will show up as a blank line
for i in range(line_to_insert, num_names):
name = get_name(i)
file.write(F'{i + 2}. {name}\n') # Line numbering now bumped up by 1.
# Later, once you have the name to insert...
with open(filename, 'r+') as file: # Must use 'r+' to write to middle of file
file.seek(insert_position) # Move stream to the insertion line
name = get_bonus_name() # This lucky winner jumps up to 3rd place
new_line = F'{line_to_insert + 1}. {name}'
file.write(new_line[:max_len]) # Slice so you don't overwrite next line
Unfortunately there is no way to delete-without-replacement any excess null characters that did not get overwritten (or in general any characters anywhere in the middle of a file), unless you then re-write everything that follows. But the null characters will not affect how your file looks to a human (they have zero width).

Python- How to Remove Columns from a File

I'd like to remove the first column from a file. The file contains 3 columns separated by space and the columns has the following titles:
X', 'Displacement' and 'Force' (Please see the image).
I have came up with the following code, but to my disappointment it doesn't work!
f = open("datafile.txt", 'w')
for line in f:
line = line.split()
del x[0]
f.close()
Any help is much appreciated !
Esan
First of all, you're attempting to read from a file (by iterating through the file contents) that is open for writing. This will give you an IOError.
Second, there is no variable named x in existence (you have not declared/set one in the script). This will generate a NameError.
Thirdly and finally, once you have finished (correctly) reading and editing the columns in your file, you will need to write the data back into the file.
To avoid loading a (potentially large) file into memory all at once, it is probably a good idea to read from one file (line by line) and write to a new file simultaneously.
Something like this might work:
f = open("datafile.txt", "r")
g = open("datafile_fixed.txt", "w")
for line in f:
if line.strip():
g.write("\t".join(line.split()[1:]) + "\n")
f.close()
g.close()
Some reading about python i/o might be helpful, but something like the following should get you on your feet:
with open("datafile.txt", "r") as fin:
with open("outputfile.txt", "w") as fout:
for line in fin:
line = line.split(' ')
if len(line) == 3:
del line[0]
fout.write(line[0] + ' ' + line[1])
else:
fout.write('\n')
EDIT: fixed to work with blank lines
print ''.join([' '.join(l.split()[1:]) for l in file('datafile.txt')])
or, if you want to preserve spaces and you know that the second column always starts at the, say, 10th character:
print ''.join([l[11:] for l in file('datafile.txt')])

Categories