Removing lines above a specific line in text in Python

I have text like this
ABC
DEF
Ref.By
AAA
AAA
I want to remove all the lines before the line Ref.By.
How can I do it in Python?

Try this
text_str = """ABC
DEF
Ref.By
AAA
AAA"""
text_lines = text_str.split("\n")
idx = text_lines.index("Ref.By") + 1
result_text = "\n".join(text_lines[idx:])
print(result_text)

with open('your_file.txt', 'r') as f:
    lines = f.readlines()
search = 'Ref.By'
for i, line in enumerate(lines):
    if search in line:
        break
if i < len(lines) - 1:
    with open('your_file.txt', 'w') as f:
        # readlines() keeps each trailing newline, so write the lines back unchanged
        f.writelines(lines[i + 1:])
This is fine as long as the file is small (say, well within 2-4 MB), because readlines() loads every line into memory at once. For much larger files, holding the whole file in memory becomes a problem.
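For larger files, a streaming variant avoids loading everything at once. The following is only a sketch: the function name drop_lines_through and the use of a temporary file are my own assumptions, not part of the answers above. It copies the lines after the marker into a temporary file and then swaps that file into place.

```python
import os
import tempfile


def drop_lines_through(path, marker):
    """Rewrite `path`, keeping only the lines after the first line containing `marker`.

    Returns True if the marker was found; otherwise the file is left untouched.
    (Hypothetical helper, shown for illustration.)
    """
    directory = os.path.dirname(os.path.abspath(path))
    found = False
    # Write to a temporary file in the same directory so the final swap is a simple rename.
    with open(path, "r") as src, tempfile.NamedTemporaryFile(
        "w", dir=directory, delete=False
    ) as dst:
        for line in src:
            if found:
                dst.write(line)   # copy everything after the marker line
            elif marker in line:
                found = True      # skip the marker line itself
        tmp_name = dst.name
    if found:
        os.replace(tmp_name, path)  # swap the trimmed copy into place
    else:
        os.remove(tmp_name)         # marker missing: discard the copy
    return found
```

Only one line is held in memory at a time. Note that, unlike the snippet above, this version empties the file when the marker is on the last line.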

Related

Iterate two lines at a time over text file, while incrementing one line at a time in python

So let's say I have a text file, which contains this:
a
b
c
d
e
I want to iterate through every line of this file, but in the process also get the line following the first line. I have tried this:
with open(txt_file, "r") as f:
    for line1, line2 in itertools.zip_longest(*[f] * 2):
        if line2 is not None:
            print(line1.rstrip() + line2.rstrip())
        else:
            print(line1.rstrip())
which returns something like:
ab
cd
e
However, I would like to have output like this:
ab
bc
cd
de
e
Anyone have an idea for how to accomplish this? Thanks in advance!
Why iterator? Simply cache one line:
with open("t.txt", "w") as f:
    f.write("a\nb\nc\nd\ne")
with open("t.txt", "r") as f:
    ll = next(f)  # get the first line
    for line in f:  # get the remaining ones
        print(ll.rstrip() + line.rstrip())
        ll = line  # cache current line as last line
    print(ll)  # get last one
Output:
ab
bc
cd
de
e
Alternatively, keep the previous line in a variable:
with open(txt_file, "r") as f:
    last = None
    for line in f:
        if last is not None:
            print(last + line.rstrip())
        last = line.rstrip()
    # print the last line
    if last is not None:
        print(last)
A simple solution would be:
with open(txt_file, "r") as f:
    content = f.read().splitlines()
for i, line in enumerate(content):
    if i == len(content) - 1:
        print(line)
    else:
        print(line + content[i+1])
You can also create a generator which takes an iterable as an input parameter and yields tuples of (previous_element, element).
def with_previous(iterable):
    iterator = iter(iterable)
    previous_element = next(iterator)
    for element in iterator:
        yield previous_element, element
        previous_element = element
You need to handle the special cases where the iterable is empty or contains only one element.
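One way to cover those special cases is to pair each element with the one that follows it and use None for the missing partner at the end. This is a sketch only; the name with_next is my own and does not come from the answer above.

```python
def with_next(iterable):
    """Yield (element, next_element) pairs; the last pair has None as next_element."""
    iterator = iter(iterable)
    try:
        current = next(iterator)
    except StopIteration:
        return  # empty input: yield nothing
    for following in iterator:
        yield current, following
        current = following
    yield current, None  # the final element has no successor


lines = ["a\n", "b\n", "c\n", "d\n", "e\n"]
combined = [
    first.rstrip() + (second.rstrip() if second is not None else "")
    for first, second in with_next(lines)
]
print(combined)  # ['ab', 'bc', 'cd', 'de', 'e']
```

Because the function accepts any iterable, an open file object can be passed in place of the list.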

How to use Python to read a txt file line by line within a particular range while ignoring empty lines?

I tried to read a txt file line by line for 10 lines, starting from a certain string, and ignoring empty lines. Here's the code I used:
a = []
file1 = open('try2.txt', 'r')
for line in file1:
    if line.startswith('Merl'):
        for line in range(10):
            if line != '\n':
                a.append(next(file1))
print(a)
But the output still included empty lines. Any suggestions please?
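The problem occurs because you check whether line equals '\n' but then append the next line (and inside the inner loop, line is actually the integer from range(10), so the check is always true). The solution is to append the current line and then call next(file1).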
a = []
file1 = open('try2.txt', 'r')
for line in file1:
    if line.startswith('Merl'):
        for i in range(10):
            if line != '\n':
                a.append(line)
            line = next(file1)
print(a)
If I understood correctly, you only want to look at the first 10 lines, right? Then try the following:
a = []
file1 = open('try2.txt', 'r')
counter = 0
for line in file1:
    counter += 1
    if counter > 10:
        break
    if line.startswith('Merl'):
        if line != '\n':
            a.append(next(file1))
print(a)
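If the goal is "up to 10 non-empty lines, starting at the first line that begins with a given string", itertools can express that directly. The sketch below rests on two assumptions of mine: the marker line itself is included, and blank lines do not count toward the limit. The function name lines_from_marker is hypothetical.

```python
from itertools import dropwhile, islice


def lines_from_marker(lines, prefix, limit=10):
    """Return up to `limit` non-empty lines, starting at the first line that begins with `prefix`."""
    # Skip everything before the first line starting with the prefix.
    from_marker = dropwhile(lambda line: not line.startswith(prefix), lines)
    # Drop blank (whitespace-only) lines before counting.
    non_empty = (line for line in from_marker if line.strip())
    # Stop after `limit` lines, or earlier if the input runs out.
    return list(islice(non_empty, limit))


sample = ["intro\n", "Merl 1\n", "\n", "a\n", "\n", "b\n"]
print(lines_from_marker(sample, "Merl", limit=2))  # ['Merl 1\n', 'a\n']
```

With a real file, pass the open file object as lines. Reaching the end of the file early simply returns fewer lines instead of raising StopIteration.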

Fastest way to convert files into lists?

I have a .txt file which contains some words:
e.g
bye
bicycle
bi
cyc
le
and I want to return a list which contains all the words in the file. I have tried some code which actually works, but I think it takes a lot of time to execute for bigger files. Is there a way to make this code more efficient?
lst1 = []
with open('file.txt', 'r') as f:
    for line in f:
        if line == '\n':  # blank line
            lst1.append(line)
        else:
            # the most efficient way I found to concatenate the letters of a specific word
            lst1.append(line.replace('\n', ''))
str1 = ''.join(lst1)
lst_fin = str1.split()
expected output:
lst_fin = ['bye', 'bicycle', 'bicycle']
I don't know if this is more efficient, but at least it's an alternative... :)
with open('file.txt') as f:
    words = f.read().replace('\n\n', '|').replace('\n', '').split('|')
print(words)
...or, if you don't want to insert a character like '|' into the data (it could already be there), you could also do:
with open('file.txt') as f:
    words = f.read().split('\n\n')
words = [w.replace('\n', '') for w in words]
print(words)
result is the same in both cases:
# ['bye', 'bicycle', 'bicycle']
EDIT:
I think I have another approach. However, if I understand correctly, it requires the file not to start with a blank line...
with open('file.txt') as f:
    res = []
    current_elmnt = next(f).strip()
    for line in f:
        if line.strip():
            current_elmnt += line.strip()
        else:
            res.append(current_elmnt)
            current_elmnt = ''
    if current_elmnt:
        # keep the last word when the file does not end with a blank line
        res.append(current_elmnt)
print(res)
Perhaps you want to give it a try...
You can use the iter function with a sentinel of '' instead:
with open('file.txt') as f:
    lst_fin = list(iter(lambda: ''.join(iter(map(str.strip, f).__next__, '')), ''))
Demo: https://repl.it/#blhsing/TalkativeCostlyUpgrades
You could use this (I don't know about its efficiency):
lst = []
s = ''
with open('tp.txt', 'r') as file:
    for i in file:
        if i == '\n':
            lst.append(s)
            s = ''
        else:
            s += i.rstrip()
if s:
    # last word, when the file does not end with a blank line
    lst.append(s)
print(lst)
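Another option, sketched here under the assumption that words are separated by blank lines as in the expected output, is itertools.groupby: it splits the stripped lines into runs of non-empty and empty lines, and each non-empty run is joined into one word. The function name words_from_lines is my own.

```python
from itertools import groupby


def words_from_lines(lines):
    """Join consecutive non-blank lines into words; blank lines act as separators."""
    stripped = (line.strip() for line in lines)
    # groupby(key=bool) alternates between runs of non-empty (True) and empty (False) lines.
    return ["".join(group) for has_text, group in groupby(stripped, key=bool) if has_text]


sample = ["bye\n", "\n", "bicycle\n", "\n", "bi\n", "cyc\n", "le\n"]
print(words_from_lines(sample))  # ['bye', 'bicycle', 'bicycle']
```

It reads the input lazily, so an open file object can be passed directly. Whether it is faster than the other answers would have to be measured.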

Indexing lines in a Python file

I want to open a file, and simply return the contents of said file with each line beginning with the line number.
So hypothetically, if the contents of a file are
a
b
c
I would like the result to be
1: a
2: b
3: c
I'm kind of stuck. I tried enumerating, but it doesn't give me the desired format.
This is for uni, but only a practice test.
A couple of bits of trial code to prove I have no idea what I'm doing or where to start:
def print_numbered_lines(filename):
    """returns the infile data with a line number in front of the contents"""
    in_file = open(filename, 'r').readlines()
    list_1 = []
    for line in in_file:
        for item in line:
            item.index(item)
            list_1.append(item)
    return list_1

def print_numbered_lines(filename):
    """returns the infile data with a line number in front of the contents"""
    in_file = open(filename, 'r').readlines()
    result = []
    for i in in_file:
        result.append(enumerate(i))
    return result
A file handle can be treated as an iterable.
with open('tree_game2.txt') as f:
    for i, line in enumerate(f, start=1):
        # strip the trailing newline so print() does not add a blank line
        print("{0}: {1}".format(i, line.rstrip('\n')))
There seems to be no need to write a Python script; awk would solve your problem. Use $0 (the whole line) rather than $1, which would print only the first field:
awk '{print NR": "$0}' your_file > new_file
What about using an OrderedDict?
from collections import OrderedDict
c = OrderedDict()
n = 1
with open('file.txt', 'r') as f:
    for line in f:
        c.update({n: line})
        # if you just want to print it, skip the dict part and just do:
        print(n, line, end='')
        n += 1
Then you can print it out with:
for n, line in c.items():  # .iteritems() in Python 2
    print(n, line, end='')
A simple way to do it: first open the file, then use a counter.
For example:
with open('file.txt') as f:  # file name is a placeholder
    data = f.read()
lines = data.split("\n")
count = 1  # start at 1 to match the numbering asked for in the question
for line in lines:
    print("line " + str(count) + ">" + line)
    count += 1
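Since the question asks to return the numbered contents rather than print them, here is a minimal sketch that builds the result as a single string in the "1: a" format. The function name numbered_lines is hypothetical, and it takes any iterable of lines (an open file works too).

```python
def numbered_lines(lines):
    """Return the lines as one string, each prefixed with its 1-based line number."""
    return "\n".join(
        "{0}: {1}".format(number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
    )


print(numbered_lines(["a\n", "b\n", "c\n"]))
# 1: a
# 2: b
# 3: c
```

enumerate(..., start=1) avoids the manual counter, and stripping the trailing newline keeps the output from being double-spaced.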

Import textfile - list index out of range

infile = open("/Users/name/Downloads/points.txt", "r")
line = infile.readline()
while line != "":
    line = infile.readline()
    wordlist = line.split()
    x_co = float(wordlist[0])
    y_co = float(wordlist[1])
I looked around but didn't find anything helpful for my problem.
I have a .txt file with x (first column) and y (second column) coordinates (see picture).
I want every x and y coordinate separated but when I run my code I always get an ERROR:
x_co = float(wordList[0])
IndexError: list index out of range
Thanks for helping!
filename = "/Users/name/Downloads/points.txt"
with open(filename) as infile:
    for line in infile:
        wordlist = line.split()
        x_co = float(wordlist[0])
        y_co = float(wordlist[1])
The with statement automatically handles closing the file.
You can also do it this way:
infile = open("/Users/name/Downloads/points.txt", "r")
for line in infile:
    if line.strip():  # skip blank lines; a bare "\n" is still truthy
        wordlist = line.split()
        x_co = float(wordlist[0])
        y_co = float(wordlist[1])
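The IndexError in the question most likely comes from the loop order: the next line is read after the while test, so the final readline() returns "" and "".split() is an empty list, which has no element 0. A blank line in the file has the same effect. The sketch below collects all points and skips lines with fewer than two fields; the name parse_points is my own, and it assumes whitespace-separated columns.

```python
def parse_points(lines):
    """Return a list of (x, y) float tuples, skipping blank or incomplete lines."""
    points = []
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # blank or incomplete line: nothing to index into
        points.append((float(fields[0]), float(fields[1])))
    return points


print(parse_points(["1.0 2.0\n", "\n", "3.5 4.5\n"]))  # [(1.0, 2.0), (3.5, 4.5)]
```

With a real file, pass the file object opened in a with block as lines.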
