Print lines from file between header and empty line - python

I have a file with the following structure: header - data - blank lines - header ... I am trying to print the lines of my datafile between the header and the first blank line. The piece of code I am using doesn't seem to work; do you have any suggestion?
for j in range(0, number_of_angles, 1):
    start_line = omega_line_number[j]+5 #start line is the line number after the header
    print start_line
    for line in range(start_line,num_lines,1): #num_lines is the total number of lines
        stripped = line.strip()
        if not stripped:
            break
        print line

I wrote a quick program that works on a text file I made for this. Basically, the text file contains 10 lines containing the word "header" and 20 lines containing the word "body". After that, I have 10 blank lines and repeat the first 30 lines. Here is the code that prints only the body.
if __name__ == '__main__':
    # Open the file for reading.
    rd_file_name = "../txt/q1.txt"
    rd_file = open(rd_file_name, 'r')
    # Read through the file line by line.
    for line in rd_file.readlines():
        # Decide what to do based on the content in the line.
        if "header" in line.lower():
            # Don't print the header
            pass
        elif line.strip() == "":
            # Quit if you see a blank line
            break
        else:
            # We print the body. Lines already end with a newline, so the
            # trailing comma stops print from adding another one.
            print line,
    # Clean up
    rd_file.close()

In your inner loop, line comes from range(), so it is an integer (a line number), not the text of the line, and integers have no strip() method.
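A minimal sketch of one way around this, assuming the file has first been read into a list so that a line number can be used as an index. The function name `block_after_header` and the sample data are hypothetical; `start` plays the role of `omega_line_number[j]+5` from the question.

```python
def block_after_header(lines, start):
    """Return the lines from index `start` up to the first blank line."""
    block = []
    for line in lines[start:]:
        # A line that is empty after stripping whitespace ends the block.
        if not line.strip():
            break
        block.append(line.rstrip("\n"))
    return block


# Hypothetical data: header, two data lines, a blank line, then another header.
sample = ["header\n", "1 2\n", "3 4\n", "\n", "header\n", "5 6\n"]
for text in block_after_header(sample, 1):
    print(text)
```

With a real file, `lines` could come from `f.readlines()`, so the header line numbers already computed in the question can be reused unchanged.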

Related

Remove text lines and strip lines with condition in python

I have a text file in this format:
000000.png 712,143,810,307,0
000001.png 599,156,629,189,3 387,181,423,203,1 676,163,688,193,5
000002.png 657,190,700,223,1
000003.png 614,181,727,284,1
000004.png 280,185,344,215,1 365,184,406,205,1
I want to remove the lines that contain no block ending in 1 or 5, i.e. no [number1,number2,number3,number4,1] or [number1,number2,number3,number4,5]. In the lines that remain, I also want to remove every block [number1,number2,number3,number4,number5] that does not fulfill this condition.
The above text file should look like this in the end:
000001.png 387,181,423,203,1 676,163,688,193,5
000002.png 657,190,700,223,1
000003.png 614,181,727,284,1
000004.png 280,185,344,215,1 365,184,406,205,1
My code:
import os
with open("data.txt", "r") as input:
    with open("newdata.txt", "w") as output:
        # iterate all lines from file
        for line in input:
            # if substring contain in a line then don't write it
            if ",0" or ",2" or ",3" or ",4" or ",6" not in line.strip("\n"):
                output.write(line)
I tried something like this, but it obviously didn't work.
No need for Regex, this might help you:
with open("data.txt", "r") as input: # Read all data lines.
    data = input.readlines()
with open("newdata.txt", "w") as output: # Create output file.
    for line in data: # Iterate over data lines.
        line_elements = line.split() # Split line by spaces.
        line_updated = [line_elements[0]] # Initialize fixed line (without undesired patterns) with image's name.
        for i in line_elements[1:]: # Iterate over groups of numbers in current line.
            tmp = i.split(',') # Split current group by commas.
            if len(tmp) == 5 and (tmp[-1] == '1' or tmp[-1] == '5'):
                line_updated.append(i) # If the pattern is Ok, append group to fixed line.
        if len(line_updated) > 1: # If the fixed line is valid, write it to output file.
            output.write(f"{' '.join(line_updated)}\n")
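For reference, the condition in the question always passes because `",0" or ",2" or ...` evaluates to the first non-empty string, `",0"`, which is truthy, so the `not in` test is never reached. The sketch below wraps the same per-line filtering as the answer in a function so it can be tried without files; the name `keep_groups` and the `wanted` parameter are hypothetical.

```python
def keep_groups(line, wanted=("1", "5")):
    """Keep the file name plus the 5-field groups whose last field is wanted.

    Returns None when the line is empty or no group survives.
    """
    parts = line.split()
    if not parts:
        return None
    kept = []
    for group in parts[1:]:
        fields = group.split(",")
        if len(fields) == 5 and fields[-1] in wanted:
            kept.append(group)
    if not kept:
        return None
    return " ".join([parts[0]] + kept)


# The `or` chain short-circuits on the first truthy operand.
print(repr(",0" or ",2"))
print(keep_groups("000001.png 599,156,629,189,3 387,181,423,203,1 676,163,688,193,5"))
```

Returning `None` for rejected lines lets the caller decide whether to skip them or log them.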

How to format a text file such that it removes blank lines and leading/trailing spaces

I have test.txt that looks like the screenshot
PS: There are leading spaces in the second line, so it is <space>line 2.
The result should be:
line 1
line 2
line 3
This is what I have so far
with open("test", 'r+') as fd:
    lines = fd.readlines()
    fd.seek(0)
    fd.writelines(line for line in lines if line.strip())
    fd.truncate()
But it does not handle the case where a line starts with a space (line 2 in the example). How do I modify my code? I want to use Python.
Test.txt file:
first line
second line
third line
Python Code:
# Imports
import os

# Global variables
fileName: str = "test.txt"

# Logic
def FileCorection(file: str):
    temp_file = f"temp_{file}"
    # Read the original file and write the cleaned lines to a temporary file.
    with open(file, "r") as r, open(temp_file, "w") as w:
        for line in r:
            # Strip whitespace from both ends of the line.
            tempLine: str = line.strip()
            # Skip lines that are empty after stripping.
            if tempLine == "":
                continue
            # Show the line and write it to the temporary file.
            print(tempLine)
            w.write(f"{tempLine}\n")
    # The with statement has already closed both files here.
    # Replace the original file with the temporary one.
    os.replace(temp_file, file)

if __name__ == "__main__":
    FileCorection(fileName)
    # Show when it is done
    print(">> DONE!")
Console out:
first line
second line
third line
>> DONE!
Process finished with exit code 0
P.S.: The code was updated/optimized!
I would suggest posting the input as formatted text (a screenshot of the text file would also do). Assuming your input looks like the example, you can use strip() to handle lines that begin with a space.
#Code
with open(r"demo.txt","r") as f:
    data = f.read()
# Strip each line and skip lines that are empty or contain only whitespace
data_list = [s.strip() for s in data.split("\n") if s.strip()]
print("\n".join(data_list))
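The same idea as a small function that works on a string, so it can be checked without touching a file. This is a sketch; the name `clean_lines` and the sample text are hypothetical.

```python
def clean_lines(text):
    """Strip every line and drop the ones that end up empty."""
    return [s.strip() for s in text.splitlines() if s.strip()]


# Hypothetical input: a blank line, a line with leading spaces,
# and a line containing only spaces.
sample = "line 1\n\n   line 2\n   \nline 3\n"
print("\n".join(clean_lines(sample)))
```

To update a file in place, the result could be written back with `"\n".join(...)` plus a final newline, as in the question's `seek(0)` / `truncate()` approach.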

Python Loop Only Parses First Five Lines of Text File

I'm attempting to read and print a .txt file line-by-line in Python using the readline function. The below code is intended to print out the entire text file line-by-line, but as of right now, it only prints out the first five lines of the text file.
filename = input("Enter the name and extension of the file you want to open.")
file = open(filename, "r")
fileline = file.readline()
for line in fileline:
    fileline = fileline.rstrip("\n")
    print(fileline)
    fileline = file.readline()
file.close()
I expect the code to print out the entire file line by line, but it currently only prints out the first five lines. What is the bug in the code?
This line:
for line in fileline:
is looping through the characters of fileline, which contains the first line of the file. So if the first line has 5 characters, this loop will execute 5 times.
Then inside the loop, you print the line and then read the next line into the fileline variable. That has no effect on the loop, which is still iterating over the characters in the first line.
To make the program deliberately print the first 5 lines, you can do:
for i in range(5):
    fileline = file.readline()
    if (fileline == ''): #end of file reached
        break
    print(fileline.rstrip('\n'))
Or you can iterate over file, which automatically reads lines, and use a separate counter variable
i = 0
for line in file:
    print(line.rstrip('\n'))
    i += 1
    if i == 5:
        break
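To print the whole file, which is what the question asks for, iterating over the file object without a counter is enough. A minimal sketch, using `io.StringIO` as a stand-in for an opened file; the helper name `read_all_lines` and the sample content are hypothetical.

```python
import io


def read_all_lines(file):
    """Return every line of an open text file without its trailing newline."""
    return [line.rstrip("\n") for line in file]


# Iterating over a string yields characters, which is why the original
# loop ran once per character of the first line.
print(list("alpha"))

# Iterating over a file object yields whole lines until end of file.
sample = io.StringIO("alpha\nbeta\ngamma\ndelta\nepsilon\nzeta\n")
for text in read_all_lines(sample):
    print(text)
```

With a real file, `with open(filename) as file:` followed by the same loop also closes the file automatically.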

Splitting a ".txt" file

I have data in a .txt file, in the form of a comma-separated list. For example:
N12345678,B,A,D,D,C,B,D,A,C,C,D,B,A,B,A,C,B,D,A,C,A,A,B,D,D
N12345678,B,A,D,D,C,B,D,A,C,C,D,B,A,B,A,C,B,D,A,C,A,A,B,D,D
I want to be able to split it up, first by line, then by comma, so I'm able to process the data and validate it. I am getting "invalid" for all of lines in my code, even though some of them should be valid because there should be 26 characters per line. Here is my code so far:
(filename+".txt").split("\n")
(filename+".txt").split(",")
with open(filename+".txt") as f:
    for line in f:
        if len(line) != 26:
            print ("invalid")
        else:
            print ("valid")
This code is quite far from working; it's syntactically valid Python, but it doesn't mean anything sensible.
# These two lines add two strings together, returning a string
# then they split the string into pieces into a list
# because the /filename/ has no newlines in it, and probably no commas
# that changes nothing
# then the return value isn't saved anywhere, so it gets thrown away
(filename+".txt").split("\n")
(filename+".txt").split(",")
# This opens the file and reads from it line by line,
# which means "line" is a string of text for each line in the file.
with open(filename+".txt") as f:
    for line in f:
        # This checks if the line in the file is not the /number/ 26
        # since the file contains /strings/ it never will be the number 26
        if line != 26:
            print ("invalid")
        # so this is never hit
        else:
            print ("valid")
[Edit: even in your updated code, the line is the whole text "N12345678,B,A,D..." and because of the commas, len(line) will be longer than 26 characters.]
It seems you want something more like: Drop the first two lines of your code completely, read through the file line by line (meaning you normally don't have to care about "\n" in your code). Then split each line by commas.
with open(filename+".txt") as f:
    for line in f:
        line_parts = line.split(",")
        if len(line_parts) != 26:
            print ("invalid")
        else:
            print ("valid")
        # line_parts is now a list of strings
        # ["N12345678" ,"B", "A", ...]
I think an easier way to do this would be to use the csv module.
import csv
with open("C:/text.csv") as input:
    reader = csv.reader(input)
    for row in reader:
        if len(row) == 26:
            print("Valid")
        else:
            print("Invalid")
As far as I understand your question, you want this.
with open(filename, 'r') as f:
    for line in f:
        if len(line.split(',')) !=26:
            print("Invalid")
        else:
            print("Valid")
All it does is:
1. Open the file.
2. Read the file line by line.
3. For each line, split the line by ,
4. As str.split() returns a list, check whether the length of the list is 26.
5. If the length is 26, count the line as valid; otherwise not.
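A short sketch contrasting the two checks on one of the sample rows: the length of the raw line counts characters (commas and the newline included), while the length of the split list counts fields. The function name `is_valid` and its `expected` parameter are hypothetical.

```python
def is_valid(line, expected=26):
    """Return True when the comma-separated line has `expected` fields."""
    fields = line.rstrip("\n").split(",")
    return len(fields) == expected


row = "N12345678,B,A,D,D,C,B,D,A,C,C,D,B,A,B,A,C,B,D,A,C,A,A,B,D,D\n"

# Character count of the whole line, which is not 26.
print(len(row))
# Field count after splitting on commas.
print(len(row.rstrip("\n").split(",")))
print("valid" if is_valid(row) else "invalid")
```

Stripping the newline before splitting keeps it out of the last field, which matters once the individual answers are validated as well.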

How to only read a file between certain phrases

Just a basic question. I know how to read information from a file etc but how would I go about only including the lines that are in between certain lines?
Say I have this :
Information Included in file but before "beginning of text"
" Beginning of text "
information I want
" end of text "
Information included in file but after the "end of text"
Thank you for any help you can give to get me started.
You can read the file in line by line until you reach the start-markerline, then do something with the lines (print them, store them in a list, etc) until you reach the end-markerline.
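with open('myfile.txt') as f:
    line = f.readline()
    # Skip everything up to the start marker; readline() returns '' at end of file
    while line and line != ' Beginning of text \n':
        line = f.readline()
    # Move past the start marker line itself
    line = f.readline()
    while line and line != ' end of text \n':
        # add code to do something with the line here
        line = f.readline()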
Make sure to match the start and end marker lines exactly. In your example they have a leading and a trailing blank.
Yet another way to do it is to use the two-argument version of iter():
start = '" Beginning of text "\n'
end = '" end of text "\n'
with open('myfile.txt') as f:
    for line in iter(f.readline, start):
        pass
    for line in iter(f.readline, end):
        print line
See https://docs.python.org/2/library/functions.html#iter for details.
I would just read the file line by line and check each line if it matches beginning or end string. The boolean readData then indicates if you are between beginning and end and you can read the actual information to another variable.
# Open the file
f = open('myTextFile.txt')
# Read the first line
line = f.readline()
readData = False
# Keep reading one line at a time until the end of the file
while line:
    # Compare without the trailing newline; adjust the markers to match your file
    text = line.rstrip('\n')
    if text == '" Beginning of text "':
        # Start marker: the data begins on the next line
        readData = True
    elif text == '" end of text "':
        # End marker: stop collecting
        readData = False
    elif readData:
        # We are between beginning and end
        (...)
    line = f.readline()
f.close()
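The flag-based approach can also be written as a generator that iterates over the lines directly. A loop over the file ends at end of file even when a marker is missing, whereas a `readline()` loop that only compares against a marker keeps receiving `''` and never stops. This is a sketch: the name `lines_between` is hypothetical, and it assumes it is acceptable to ignore surrounding whitespace when matching markers and in the yielded lines.

```python
def lines_between(lines, start, end):
    """Yield the stripped lines found after `start` and before `end`."""
    inside = False
    for line in lines:
        text = line.strip()
        if not inside:
            # Still looking for the start marker.
            inside = (text == start)
        elif text == end:
            # End marker reached: stop reading.
            return
        else:
            yield text


# Sample data modelled on the question; the markers include the quote characters.
sample = [
    'Information Included in file but before "beginning of text"\n',
    '" Beginning of text "\n',
    "information I want\n",
    '" end of text "\n',
    'Information included in file but after the "end of text"\n',
]
for text in lines_between(sample, '" Beginning of text "', '" end of text "'):
    print(text)
```

Because an open file is itself an iterable of lines, `lines_between(f, ...)` works the same way inside a `with open(...) as f:` block.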
