So I have a text file that has writing in it and I created a for loop which finds a specified string in the file and prints out the lines that have the string contained in it. But now I'm stuck because I want to modify the code so I can write a new file that contains what it already printed out.
I've tried researching around for answers, but I can't seem to find any solutions or how to even search for what I am trying to do. I tried carefully looking at the parameters of print function and join method.
file = open("datalist.txt", "r")
s = "hello"
file_export = open("newfile.txt", "w")
for lines in file:
    lines = lines.lower()
    index = lines.find(s)
    if index != -1:
        indexed = lines[index:]
        print(lines[index:], end='')
The printed message I need is something along the lines of:
hello,
hello:
hello;
print is not the function you are looking for here. To write to a file, use file.write (or file.writelines for a list of strings); take a look at the input and output documentation.
To give an idea, your code should be something like this:
file = open("datalist.txt", "r")
s = "hello"
file_export = open("newfile.txt", "w")
for lines in file:
    lines = lines.lower()
    index = lines.find(s)
    if index != -1:
        indexed = lines[index:]
        print(lines[index:], end='')
        file_export.write(lines[index:])
Also, note that you should close your file when you are done, so add the following at the end:
file.close()
file_export.close()
Or, as an alternative, open the files with a with statement (a context manager), which closes them for you automatically.
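A minimal sketch of the context-manager version, in case it helps. The function name export_matches and its parameters are made up for illustration; the logic is the same find-and-write loop as above.

```python
def export_matches(src_path, dst_path, needle):
    """Write the tail of every line containing `needle` (case-insensitive) to dst_path."""
    needle = needle.lower()
    # Both files are closed automatically when the with block exits,
    # even if an exception is raised inside it.
    with open(src_path, "r") as src, open(dst_path, "w") as dst:
        for line in src:
            line = line.lower()
            index = line.find(needle)
            if index != -1:
                # Lines read from a file keep their trailing newline,
                # so write() does not need an extra "\n" here.
                dst.write(line[index:])
```

With the file names from the question, the call would be export_matches("datalist.txt", "newfile.txt", "hello").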
I have a code that reads multiple text files and print the last line.
from glob import glob
text_files = glob('C:/Input/*.txt')
for file_name in text_files:
    with open(file_name, 'r+') as f:
        lines = f.read().splitlines()
        last_line = lines[-3]
        print (last_line)
I want to redirect the print output to a text file so that I can check the sentences. The text files also contain multiple blank lines. I want to delete all the empty lines and write the last line of each file to an output file. When I try to write, only the last line of the last file read is written, not the last line of every file.
Can someone help ?
Thanks,
Aarush
I think you have two separate questions.
Next time you use Stack Overflow, if you have multiple questions, please post them separately.
Question 1
How do I re-direct the output from the print function to a file?
For example, consider a hello world program:
print("hello world")
How do we create a file (named something like text_file.txt) in the current working directory, and output the print statements to that file?
ANSWER 1
Writing output from the print function to a file is simple to do:
with open ('test_file.txt', 'w') as out_file:
    print("hello world", file=out_file)
Note that the print function accepts a keyword argument named "file".
You must write file=f in order to pass an open file f to the print function.
QUESTION 2
How do I get the last non-blank line from a file? I have an input file which has lots of line feeds, carriage returns, and space characters at the end of it. We need to ignore blank lines and retrieve the last line of the file which contains at least one character that is not a white-space character.
Answer 2
def get_last_line(file_stream):
    # a file object cannot be passed to `reversed()` directly,
    # so read all the lines into a list first
    for line in reversed(file_stream.readlines()):
        # `strip()` removes all leading and trailing white-space characters
        # `strip()` removes `\n`, `\r`, `\t`, space chars, etc...
        line = line.strip()
        if len(line) > 0:
            return line
    # if the file contains nothing but blank lines
    # return the empty string
    return ""
You can process multiple files like so:
file_names = ["input_1.txt", "input_2.txt", "input_3.txt"]
with open ('out_file.txt', 'w') as out_file:
    for file_name in file_names:
        with open(file_name, 'r') as read_file:
            last_line = get_last_line(read_file)
        print (last_line, file=out_file)
Instead of just print, do something like this:
print(last_line)
with open('output.txt', 'w') as fout:
    fout.write(last_line)
Or you could also append to the file!
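Appending matters here because opening a file with mode 'w' truncates it every time, so inside a loop only the last write survives. A small sketch of the append variant; the helper name append_line is hypothetical:

```python
def append_line(path, text):
    """Append `text` as one line to `path`, creating the file if it does not exist."""
    # Mode "a" adds to the end of the file; mode "w" would empty it on every open.
    with open(path, "a") as fout:
        fout.write(text + "\n")
```

Inside the loop over the input files you would call append_line('output.txt', last_line). Keep in mind that running the script again keeps adding to the same file.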
text_file.txt
I am getting the output for the first print statement but not for the second print statement. Please suggest the correct code. Is there anything I have to encode or decode? Please help me, I'm new to Python 3.
Here's a more straightforward implementation of what you're trying to achieve. You can read the file into a Python list and reference each line by a Python list index
with open('text_file.txt','r') as f: # automatically closes the file
    input_file = f.readlines() # Read all lines into a Python list
for line_num in range(len(input_file)):
    if "INBOIS BERCUKAI" in input_file[line_num]:
        print(input_file[line_num + 2]) # offset by any number you want
    # same for other if statements
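One thing to watch with a fixed offset: if the marker sits near the end of the file, line_num + 2 runs past the end of the list and raises IndexError. A hedged sketch with a bounds check; the function name lines_after and its parameters are illustrative only:

```python
def lines_after(lines, marker, offset=2):
    """Return the line `offset` positions after each line that contains `marker`."""
    found = []
    for line_num, line in enumerate(lines):
        target = line_num + offset
        # Skip matches whose target would fall past the end of the list.
        if marker in line and target < len(lines):
            found.append(lines[target])
    return found
```

You would pass it the list returned by f.readlines(), for example lines_after(input_file, "INBOIS BERCUKAI").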
I am writing in python 3.6 and am having trouble making my code match strings in a short text document. this is a simple example of the exact logic that is breaking my bigger program:
PATH = "C:\\Users\\JoshLaptop\\PycharmProjects\\practice\\commented.txt"
file = open(PATH, 'r')
words = ['bah', 'dah', 'gah', "fah", 'mah']
print(file.read().splitlines())
if 'bah' not in file.read().splitlines():
    print("fail")
with the text document formatted like so:
bah
gah
fah
dah
mah
and it is indeed printing out fail each time I run this. Am I using the incorrect method of reading the data from the text document?
the issue is that you're printing print(file.read().splitlines())
so it exhausts the file, and the next call to file.read().splitlines() returns an empty list...
A better way to "grep" your pattern would be to iterate on the file lines instead of reading it fully. So if you find the string early in the file, you save time:
with open(PATH, 'r') as f:
    for line in f:
        if line.rstrip()=="bah":
            break
    else:
        # else is reached when no break is called from the for loop: fail
        print("fail")
The small catch here is not to forget to call line.rstrip(), because iterating over a file yields each line with its line terminator. Also, if there's trailing whitespace on a line in your file, this code will still match the word (make it strip() if you also want to match lines with leading blanks).
If you want to match a lot of words, consider creating a set of lines:
lines = {line.rstrip() for line in f}
so your in lines call will be a lot faster.
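A short sketch of that idea, assuming you want to know which words are missing from the file. The function name missing_words is made up for the example:

```python
def missing_words(path, words):
    """Return the words from `words` that do not appear as a whole line in the file."""
    with open(path, "r") as f:
        # Build the set once; each membership test is then O(1) on average.
        lines = {line.rstrip() for line in f}
    return [word for word in words if word not in lines]
```

With the file from the question, missing_words(PATH, words) should give back an empty list, since every word is on its own line.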
Try it:
PATH = "C:\\Users\\JoshLaptop\\PycharmProjects\\practice\\commented.txt"
file = open(PATH, 'r')
words = file.read().splitlines()
print(words)
if 'bah' not in words:
    print("fail")
You can't read the file two times.
When you do print(file.read().splitlines()), the file is read and the next call to this function will return nothing because you are already at the end of file.
PATH = "your_file"
file = open(PATH, 'r')
words = ['bah', 'dah', 'gah', "fah", 'mah']
if 'bah' not in (file.read().splitlines()) :
    print("fail")
As you can see, the output is no longer 'fail'. Call file.read().splitlines() only once in your code, or save its result in a variable; otherwise you get the 'fail' message.
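If you really do need to read the same open file a second time, you can rewind it with seek(0). A small sketch that shows all three cases; the function name read_twice is only for illustration:

```python
def read_twice(path):
    """Show that a second read() returns nothing until the file is rewound."""
    with open(path, "r") as f:
        first = f.read().splitlines()
        second = f.read().splitlines()   # already at end of file: empty list
        f.seek(0)                        # move back to the start of the file
        third = f.read().splitlines()    # same content as the first read
    return first, second, third
```

Saving the result in a variable, as in the answers above, is usually simpler than rewinding.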
I am trying to remove duplicates of 3-column tab-delimited txt file, but as long as the first two columns are duplicates, then it should be removed even if the two has different 3rd column.
from operator import itemgetter
import sys
input = sys.argv[1]
output = sys.argv[2]
#Pass any column number you want, note that indexing starts at 0
ig = itemgetter(0,1)
seen = set()
data = []
for line in input.splitlines():
    key = ig(line.split())
    if key not in seen:
        data.append(line)
        seen.add(key)
    file = open(output, "w")
    file.write(data)
    file.close()
First, I get error
key = ig(line.split())
IndexError: list index out of range
Also, I can't see how to save the result to output.txt
People say saving to output.txt is a really basic matter. But no tutorial helped.
I tried methods that use codec, those that use with, those that use file.write(data) and all didn't help.
I could learn MatLab quite easily. The online tutorial was fantastic and a series of Googling always helped a lot.
But I can't find a helpful tutorial of Python yet. This is obviously because I am a complete novice. For complete novices like me, what would be the best tutorial with 1) comprehensiveness AND 2) lots of examples 3) line by line explanation that dosen't leave any line without explanation?
And why is the above code causing error and not saving result?
I'm assuming, since you assign input to the first command line argument with input = sys.argv[1] and output to the second, that you intend those to be your input and output file names. But you're never opening any file for the input data, so you're calling .splitlines() on a file name, not on file contents.
Next, splitlines() is the wrong approach here anyway. To iterate over a file line by line, simply use for line in f, where f is an open file. Those lines will include the newline at the end of the line, so it needs to be stripped if it's not supposed to be part of the third column's data.
Then you're opening and closing the file inside your loop, which means you'll try to write the entire contents of data to the file every iteration, effectively overwriting any data written to the file before. Therefore I moved that block out of the loop.
It's good practice to use the with statement for opening files. with open(out_fn, "w") as outfile will open the file named out_fn and assign the open file to outfile, and close it for you as soon as you exit that indented block.
input is a builtin function in Python. I therefore renamed your variables so no builtin names get shadowed.
You're trying to write data directly to the output file. This won't work since data is a list of lines. You need to join those lines first in order to turn them into a single string again before writing it to a file.
So here's your code with all those issues addressed:
from operator import itemgetter
import sys
in_fn = sys.argv[1]
out_fn = sys.argv[2]
getkey = itemgetter(0, 1)
seen = set()
data = []
with open(in_fn, 'r') as infile:
    for line in infile:
        line = line.strip()
        key = getkey(line.split())
        if key not in seen:
            data.append(line)
            seen.add(key)
with open(out_fn, "w") as outfile:
    outfile.write('\n'.join(data))
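To look at the de-duplication step on its own, here is a sketch of the same logic as a function that works on any iterable of lines. The name dedupe_by_key and the choice to skip lines with too few columns are my assumptions, not part of the original code; itemgetter(0, 1) would raise IndexError on a blank line.

```python
def dedupe_by_key(lines, key_columns=(0, 1)):
    """Keep the first line seen for each combination of the key columns."""
    seen = set()
    kept = []
    for line in lines:
        line = line.strip()
        fields = line.split()
        if len(fields) <= max(key_columns):
            continue  # skip blank or short lines instead of raising IndexError
        key = tuple(fields[i] for i in key_columns)
        if key not in seen:
            seen.add(key)
            kept.append(line)
    return kept
```

Because an open file is an iterable of lines, you can pass infile straight in: data = dedupe_by_key(infile).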
Why is the above code causing error?
Because you haven't opened the file, you are working with the string input.txt rather than with the file's contents. When you then try to access the second item, you get a list index out of range error because line.split() returns ['input.txt'].
How to fix that: open the file and then work with it, not with its name.
For example, you can do (I tried to stay as close to your code as possible)
input = sys.argv[1]
infile = open(input, 'r')
(...)
lines = infile.readlines()
infile.close()
for line in lines:
    (...)
Why is this not saving result?
Because you are opening and closing the file inside the loop. What you need to do is write the data once you're out of the loop. Also, you cannot write a list directly to a file. Hence, you need to do something like this (outside of your loop):
outfile = open(output, "w")
for item in data:
    outfile.write(item)
outfile.close()
All together
There are other ways of reading/writing files, and it is pretty well documented on the internet but I tried to stay close to your code so that you would understand better what was wrong with it
from operator import itemgetter
import sys
input = sys.argv[1]
infile = open(input, 'r')
output = sys.argv[2]
#Pass any column number you want, note that indexing starts at 0
ig = itemgetter(0,1)
seen = set()
data = []
lines = infile.readlines()
infile.close()
for line in lines:
    print(line)
    key = ig(line.split())
    if key not in seen:
        data.append(line)
        seen.add(key)
print(data)
outfile = open(output, "w")
for item in data:
    outfile.write(item)
outfile.close()
PS: it seems to produce the result that you needed in "Python to remove duplicates using only some, not all, columns".
I'm a beginner at this, and not really sure how to accomplish it. Basically, I need to be able to open a file (user inputs the name, and a starting and ending address). I need to be able to search through the file and find the starting address, and only copy the information from the given starting address to the ending address, but after looking through multiple examples on this site, I can't seem to get it to function. My code for now (with many tests to make sure it can actually run is):
import pickle, pprint
def openFileandRead():
    filename = raw_input("Please enter your filename:\n")
    openedFile = open(filename, "rb")
    startAddress = raw_input("From the EEPROM file, enter 'Start Address':\n")
    endAddress = raw_input("From the EEPROM file, enter 'End Address':\n")
    with open(filename, 'rb') as searchfile:
        for line in searchfile:
            if 'searchphrase' in line:
                print line
    #this does nothing...but i'm not sure why...
    with openedFile as fp:
        for line in iter(fp.readline, ''):
            process_line(line)
            print line
    #nothing as well...
    with open(filename) as f:
        for line in f:
            print line
    #more nothing
    newFilename = '"'+ filename + '"'
    print newFilename
    print openedFile.tell
    print openedFile.name
    print openedFile.mode
    print openedFile.softspace
    print openedFile.encoding
    print type(openedFile)
    x = openedFile.readline()
    for line in x:
        return x
    print x
    #again, nothing
Sorry if I don't need a lot of it...but I don't understand why nothing works minus the print statements. I'm barely sure how to actually search for the start/end address to use them to copy that information either. Any help would be useful, thanks :)
Well, let's look through your code:
filename = raw_input("Please enter your filename:\n")
openedFile = open(filename, "rb")
startAddress = raw_input("From the EEPROM file, enter 'Start Address':\n")
endAddress = raw_input("From the EEPROM file, enter 'End Address':\n")
This is mostly correct, but keep in mind:
You don't need the '\n' in the raw_input() calls unless you really want the user input to be on the next line. (Just a nitpick.)
startAddress and endAddress should really be integers, right? (or, well, naturals, I suppose.) So you need to use int to convert them to numbers.
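A small sketch of that conversion. I'm assuming the addresses may be typed either as decimal or with a 0x prefix; the helper name parse_address is hypothetical.

```python
def parse_address(text):
    """Convert user input such as '0x1F' or '31' to an integer."""
    # Base 0 lets int() infer the base from a prefix (0x, 0o, 0b);
    # plain digits without a prefix are read as decimal.
    return int(text.strip(), 0)
```

You would wrap each prompt with it, for example startAddress = parse_address(raw_input("Start Address: ")). If the input is not a valid number, int() raises ValueError, which you may want to catch.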
Next section:
with open(filename, 'rb') as searchfile:
    for line in searchfile:
        if 'searchphrase' in line:
            print line
In the previous section of code, your goal seems to be to extract a slice of the file's contents. However, in this part, you're iterating through each line and checking if 'searchphrase' is present. If so, you're printing the line. So this code isn't helping you achieve your goal, is it? The next with block is calling an unknown function, so I can't help there. The final block should print each line - does your file even have content?
To get to a particular part of the file, you want to use file.seek(). See the linked Python documentation.
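A minimal sketch of how seek() could be used for this, assuming the start and end addresses are byte offsets into the file and that the end address is exclusive. Neither assumption comes from your question, and the name read_range is made up.

```python
def read_range(path, start, end):
    """Return the bytes from offset `start` up to, but not including, `end`."""
    with open(path, "rb") as f:
        f.seek(start)                 # jump straight to the start offset
        return f.read(end - start)    # read only the requested span
```

If the addresses in your EEPROM file are instead written as text on each line, you would have to search for them line by line rather than seek to them.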
Finally, in your last loop:
x = openedFile.readline()
for line in x:
    return x
print x
You're reading one line, then iterating through each character in that line, then returning the whole line on the very first iteration, which ends your function. You may not want that return there.