I have code that reads multiple text files and prints the last line of each.
from glob import glob
text_files = glob('C:/Input/*.txt')
for file_name in text_files:
    with open(file_name, 'r+') as f:
        lines = f.read().splitlines()
        last_line = lines[-3]
        print (last_line)
I want to redirect the printed output to a text file so that I can check the sentence. The text files also contain many blank lines. I want to ignore all the empty lines and write the last line of each file to an output file. When I try to write, only the last line of the last file read ends up in the output, not the last line of every file.
Can someone help?
Thanks,
Aarush
I think you have two separate questions.
Next time you use Stack Overflow, if you have multiple questions, please post them separately.
Question 1
How do I re-direct the output from the print function to a file?
For example, consider a hello world program:
print("hello world")
How do we create a file (named something like test_file.txt) in the current working directory, and output the print statements to that file?
Answer 1
Writing output from the print function to a file is simple to do:
with open('test_file.txt', 'w') as out_file:
    print("hello world", file=out_file)
Note that the print function accepts a keyword argument named "file".
You must write file=out_file in order to pass out_file to the print function.
Question 2
How do I get the last non-blank line from a file? I have an input file which has lots of line feeds, carriage returns, and space characters at the end of it. We need to ignore blank lines and retrieve the last line of the file which contains at least one character that is not a white-space character.
Answer 2
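If the code that prints cannot easily be changed to pass file=..., the standard library's contextlib.redirect_stdout is another option. A minimal sketch, where report() is a hypothetical stand-in for existing code and an io.StringIO buffer stands in for an open file object:

```python
import contextlib
import io


def report():
    # Hypothetical existing code that calls print() without file=...
    print("hello world")


# io.StringIO stands in for a real file opened with open(..., 'w').
buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    report()  # everything printed inside the block goes to `buffer`

captured = buffer.getvalue()
```

With a real file you would write `with open('test_file.txt', 'w') as out_file, contextlib.redirect_stdout(out_file):` instead of using the buffer.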
def get_last_line(file_stream):
    # a file object cannot be passed to `reversed()` directly,
    # so read the lines into a list first
    for line in reversed(file_stream.readlines()):
        # `strip()` removes all leading and trailing white-space characters
        # `strip()` removes `\n`, `\r`, `\t`, space chars, etc...
        line = line.strip()
        if len(line) > 0:
            return line
    # if the file contains nothing but blank lines
    # return the empty string
    return ""
You can process multiple files like so:
file_names = ["input_1.txt", "input_2.txt", "input_3.txt"]
with open('out_file.txt', 'w') as out_file:
    for file_name in file_names:
        with open(file_name, 'r') as read_file:
            last_line = get_last_line(read_file)
            print(last_line, file=out_file)
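For large files, a variant that never holds the whole file in memory may be preferable. This is only a sketch of a different technique (a forward scan that remembers the last non-blank line); the name last_non_blank_line is made up for illustration, and io.StringIO stands in for an opened file:

```python
import io


def last_non_blank_line(stream):
    """Return the last line with a non-whitespace character, or '' if none."""
    last = ""
    for line in stream:          # reads one line at a time
        stripped = line.strip()  # drops \n, \r, \t, spaces on both ends
        if stripped:
            last = stripped      # remember the most recent non-blank line
    return last


# In-memory stand-in for a file that ends with several blank lines.
sample = io.StringIO("first\n\nsecond\n   \n\n")
result = last_non_blank_line(sample)
```

The trade-off is that the whole file is still read once from start to end; it just is not stored.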
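Instead of just printing, do something like this:
print(last_line)
# 'a' appends; 'w' would truncate the file on every pass through the loop
with open('output.txt', 'a') as fout:
    fout.write(last_line + '\n')
Opening the file with 'w' inside the loop overwrites it each time, which is why only the last file's line survives. Appending avoids that.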
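The difference between the two modes is easy to see in a small experiment. This sketch writes three made-up lines to two temporary files, one reopened with 'w' each time and one with 'a'; the file names and sample strings are assumptions for illustration only:

```python
import os
import tempfile

last_lines = ["end of file one", "end of file two", "end of file three"]

with tempfile.TemporaryDirectory() as tmp:
    overwritten = os.path.join(tmp, "overwritten.txt")
    appended = os.path.join(tmp, "appended.txt")

    for text in last_lines:
        # 'w' truncates the file every time it is opened.
        with open(overwritten, "w") as fout:
            fout.write(text + "\n")
        # 'a' keeps the existing content and adds to the end.
        with open(appended, "a") as fout:
            fout.write(text + "\n")

    with open(overwritten) as f:
        overwritten_content = f.read()
    with open(appended) as f:
        appended_content = f.read()
```

Here overwritten_content holds only the third line, while appended_content holds all three. Note that append mode also keeps content from earlier runs of the script, so opening the output once with 'w' outside the loop is often the cleaner choice.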
Related
I am writing in Python 3.6 and am having trouble making my code match strings in a short text document. This is a simple example of the exact logic that is breaking my bigger program:
PATH = "C:\\Users\\JoshLaptop\\PycharmProjects\\practice\\commented.txt"
file = open(PATH, 'r')
words = ['bah', 'dah', 'gah', "fah", 'mah']
print(file.read().splitlines())
if 'bah' not in file.read().splitlines():
    print("fail")
with the text document formatted like so:
bah
gah
fah
dah
mah
and it is indeed printing out fail each time I run this. Am I using the incorrect method of reading the data from the text document?
The issue is that the first call, print(file.read().splitlines()), reads the file to the end, so the next call to file.read().splitlines() returns an empty list.
A better way to "grep" your pattern would be to iterate on the file lines instead of reading it fully. So if you find the string early in the file, you save time:
with open(PATH, 'r') as f:
    for line in f:
        if line.rstrip() == "bah":
            break
    else:
        # else is reached when no break is called from the for loop: fail
        print("fail")
The small catch here is to remember to call line.rstrip(), because iterating over the file yields each line with its line terminator. Also, if a line in your file has trailing whitespace, this code will still match the word (use strip() if you want to match even with leading blanks).
If you want to match a lot of words, consider creating a set of lines:
lines = {line.rstrip() for line in f}
so that each membership test (word in lines) will be a lot faster.
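Put together, the set-based check might look like the sketch below. The io.StringIO object is an in-memory stand-in for the opened file, and its content is made up so that one word is missing:

```python
import io

words = ['bah', 'dah', 'gah', 'fah', 'mah']

# Stand-in for `open(PATH, 'r')`; this sample deliberately lacks 'mah'.
f = io.StringIO("bah\ngah\nfah\ndah\n")

# Read the file once and keep the stripped lines in a set.
lines = {line.rstrip() for line in f}

# Each `in lines` test is now a hash lookup instead of a list scan.
missing = [word for word in words if word not in lines]
```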
Try it:
PATH = "C:\\Users\\JoshLaptop\\PycharmProjects\\practice\\commented.txt"
file = open(PATH, 'r')
words = file.read().splitlines()
print(words)
if 'bah' not in words:
    print("fail")
You can't read the file two times.
When you do print(file.read().splitlines()), the file is read and the next call to this function will return nothing because you are already at the end of file.
PATH = "your_file"
file = open(PATH, 'r')
words = ['bah', 'dah', 'gah', "fah", 'mah']
if 'bah' not in file.read().splitlines():
    print("fail")
As you can see, the output is not 'fail'. You must call file.read().splitlines() only once in your code, or save its result in a variable; otherwise you get the 'fail' message.
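The "exhausted file" behaviour can be reproduced without touching the disk. In this sketch an io.StringIO object stands in for the opened file; seek(0) is shown as one way to rewind if a second read really is needed:

```python
import io

# In-memory stand-in for `open(PATH, 'r')`.
f = io.StringIO("bah\ngah\nfah\n")

first = f.read().splitlines()   # reads everything; position is now at the end
second = f.read().splitlines()  # nothing left to read, so this is empty

f.seek(0)                       # move back to the start of the stream
third = f.read().splitlines()   # the content is available again
```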
I have three short JSON text files. I want to combine them with Python. It mostly works and creates an output file with everything in the right place, but the last line ends with a comma, which I would like to replace with }. I have come up with this code:
def join_json_file (file_name_list,output_file_name):
    with open(output_file_name,"w") as file_out:
        file_out.write('{')
        for filename in file_name_list:
            with open(filename) as infile:
                file_out.write(infile.read()[1:-1] + ",")
    with open(output_file_name,"r") as file_out:
        lines = file_out.readlines()
        print(lines[-1])
        lines[-1] = lines[-1].replace(",","")
but it doesn't replace the last line. Could somebody help me? I am new to Python and I can't find the solution by myself.
You are writing all of the files, and then loading it back in to change the last line. The change though will only be in memory, not in the file itself. The better approach would be to avoid writing the extra , in the first place. For example:
def join_json_file(file_name_list, output_file_name):
    with open(output_file_name, "w") as file_out:
        file_out.write('{')
        for filename in file_name_list[:-1]:
            with open(filename) as infile:
                # strip() first so a trailing newline is not mistaken for the brace
                file_out.write(infile.read().strip()[1:-1] + ",")
        with open(file_name_list[-1]) as infile:
            file_out.write(infile.read().strip()[1:-1])
        # close the combined object
        file_out.write('}')
This first writes all but the last file with the extra comma, then writes the last file separately, and finally writes the closing }. You might also want to check for the case of a single file.
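A different technique, not the one used above, is to let the json module parse each input and write the merged result, which avoids handling braces and commas by hand. This is a sketch under the assumption that every input is a single JSON object; merge_json_objects is a hypothetical helper that takes the file contents as strings:

```python
import json


def merge_json_objects(texts):
    """Merge several JSON object strings into one JSON object string."""
    merged = {}
    for text in texts:
        # json.loads raises an error if a text is not valid JSON.
        merged.update(json.loads(text))  # later keys overwrite earlier ones
    return json.dumps(merged)


combined = merge_json_objects(['{"a": 1}', '{"b": 2}', '{"c": 3}'])
```

This also handles a single input without a special case. Keep in mind that if two files share a key, the value from the later file wins.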
I would like to use a text file (filename.txt) that contains multiple Windows paths to .txt files, and feed it into a for loop that opens each listed file and searches it for a word. This should happen for every path listed in filename.txt.
So far this part is functioning:
with open("filename.txt", "r") as myfile:
    data = myfile.read()
print(data)
Printing the contents of the variable data gives me this:
c:/temp\Txt_folder\3rd_lyr_fldr\3rd_infiles.txt
c:/temp\Txt_folder\3rd_lyr_fldr\3rd_ListFile.txt
c:/temp\Txt_folder\3rd_lyr_fldr\3rd_new_filename1.txt
This part of the script, shown below, does not work. The data shown above is not fed into the for loop one line at a time, but rather as one continuous column, or at least that is how print(data) shows it on my screen.
for line in data:
    if re.search(r"something",line):
        print(line)
How can this be accomplished?
Here's something that basically does what you want:
keyword = 'whatever'
with open('filename.txt', 'rt') as myfile:
    for filename in (line.strip() for line in myfile):
        with open(filename, 'rt') as file:
            for line in file:
                if keyword in line:
                    print(line, end='')
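The reason the original loop misbehaves is that read() returns one string, and iterating over a string yields single characters rather than lines. A small sketch with made-up paths shows the difference:

```python
# What myfile.read() returns: one string with embedded newlines.
data = "c:/temp/a.txt\nc:/temp/b.txt\n"

# Iterating over a string gives one character at a time.
first_items = [item for item in data][:3]

# splitlines() turns the same string into a list of lines.
paths = data.splitlines()
```

Iterating over the file object itself, as in the answer above, gives the same per-line behaviour without reading everything first.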
I'm using 2 files:
1. findjava.py, which outputs all Java file names in a directory, separated by \n
2. countfile, which receives a single file name and counts its lines
I'm already receiving a string with the file names in countfile (javafile1\njavafile2\njavafile3\n).
How could I run a loop to go through all those file names one by one?
I'd need to read that string until it finds a \n, then use that part as a variable to run my script to count the lines, and then keep reading the next file name.
Split on \n.
So something like:
files = "javafile1\njavafile2\njavafile3\n"
# strip() drops the trailing \n so the list has no empty last entry
list_of_files = files.strip().split("\n")
for file in list_of_files:
    with open(file) as fh:
        lines = fh.readlines()
You didn't provide any example data, but unless it's complicated by other factors, it seems you can split() the data without further ado:
files = "javafile1\njavafile2\njavafile3\n"
for name in files.split():
    print("counting lines in file " + name)
    countfile(name)
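The three obvious ways of splitting that string behave slightly differently, which matters because an empty file name makes open() fail. A quick comparison using the sample string from the question:

```python
files = "javafile1\njavafile2\njavafile3\n"

# Splitting on "\n" leaves an empty string after the final newline.
by_newline = files.split("\n")

# split() with no argument splits on any whitespace and drops empty pieces,
# but it would also break up file names that contain spaces.
by_whitespace = files.split()

# splitlines() splits on line boundaries and adds no trailing empty entry.
by_lines = files.splitlines()
```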
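In Python you can use a for loop to access a file one line at a time:
with open("file_list") as stream:
    for filename in stream:
        # each line ends with a newline, which is not part of the file name
        filename = filename.strip()
        lines = 0
        with open(filename) as f:
            for line in f:
                lines += 1
        print(filename, lines)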
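The counting step itself can be written more compactly with a generator expression. A minimal sketch, where count_lines is a hypothetical helper and io.StringIO stands in for an opened file:

```python
import io


def count_lines(stream):
    """Count lines by iterating, without storing them."""
    return sum(1 for _ in stream)


# In-memory stand-in for `open(filename)`.
n = count_lines(io.StringIO("a\nb\nc\n"))
```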
How to remove whitespaces in the beginning of every string in a file with python?
I have a file myfile.txt with the strings as shown below in it:
_ _ Amazon.inc
Arab emirates
_ Zynga
Anglo-Indian
Those underscores are spaces.
The code must go through every line of the file and remove all the whitespace at the beginning of each line.
I've tried using lstrip, but that doesn't work for multiple lines, and neither does readlines().
Would using a for loop help?
All you need to do is read the lines of the file one by one and remove the leading whitespace from each line. After that, you can join the lines again and you'll get back the original text without the leading whitespace:
with open('myfile.txt') as f:
    line_lst = [line.lstrip() for line in f.readlines()]
    lines = ''.join(line_lst)
print(lines)
Assuming that your input data is in infile.txt, and you want to write this file to output.txt, it is easiest to use a list comprehension:
inf = open("infile.txt")
stripped_lines = [l.lstrip() for l in inf.readlines()]
inf.close()
# write the new, stripped lines to a file
outf = open("output.txt", "w")
outf.write("".join(stripped_lines))
outf.close()
To read the lines from myfile.txt and write them to output.txt, use
with open("myfile.txt") as input:
    with open("output.txt", "w") as output:
        for line in input:
            output.write(line.lstrip())
That will make sure that you close the files after you're done with them, and it'll make sure that you only keep a single line in memory at a time.
The above code works in Python 2.5 and later because of the with keyword. For Python 2.4 you can use
input = open("myfile.txt")
output = open("output.txt", "w")
for line in input:
    output.write(line.lstrip())
if this is just a small script where the files will be closed automatically at the end. If this is part of a larger program, then you'll want to explicitly close the files like this:
input = open("myfile.txt")
try:
    output = open("output.txt", "w")
    try:
        for line in input:
            output.write(line.lstrip())
    finally:
        output.close()
finally:
    input.close()
You say you already tried lstrip and that it didn't work for multiple lines. The "trick" is to run lstrip on each individual line, like I do above. You can try the code out online if you want.
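To see why calling lstrip on the whole text is not enough, compare it with the per-line version. This sketch uses a string modelled on the sample data from the question instead of a real file:

```python
text = "  Amazon.inc\nArab emirates\n Zynga\nAnglo-Indian\n"

# lstrip() on the whole string only touches the very beginning.
whole = text.lstrip()

# Stripping each line separately removes the indentation everywhere.
# keepends=True keeps the "\n" so the lines can be joined back unchanged.
per_line = "".join(line.lstrip() for line in text.splitlines(keepends=True))
```

One side effect to be aware of: a line that contains only whitespace becomes an empty string, because lstrip() also removes the newline when nothing else is left, so blank lines disappear from the output.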