Search text file and save result to another text file - python

I am pretty new to python and have only very limited programming skills with it. I hope you can help me here.
I have a large text file and I'm searching it for a specific word. Every line with this word needs to be stored to another txt file.
I can search the file and print the result in the console but not to a different file. How can I manage that?
f = open("/tmp/LostShots/LostShots.txt", "r")
searchlines = f.readlines()
f.close()
for i, line in enumerate(searchlines):
    if "Lost" in line:
        for l in searchlines[i:i+3]: print l,
        print
f.close()
Thx
Jan

Use the with context manager, and don't use readlines(), since it reads the whole contents of the file into a list. Instead, iterate over the file object line by line and check whether the word is there; if it is, write the line to the output file:
with open("/tmp/LostShots/LostShots.txt", "r") as input_file, \
     open('results.txt', 'w') as output_file:
    for line in input_file:
        if "Lost" in line:
            output_file.write(line)
Note that before Python 2.7 you cannot put multiple items in a single with statement, so nest them instead:
with open("/tmp/LostShots/LostShots.txt", "r") as input_file:
    with open('results.txt', 'w') as output_file:
        for line in input_file:
            if "Lost" in line:
                output_file.write(line)

To match whole words in general, you need regular expressions; a simple word in line check also matches blablaLostblabla, which I assume you don't want:
import re
with open("/tmp/LostShots/LostShots.txt", "r") as input_file, \
     open('results.txt', 'w') as output_file:
    output_file.writelines(line for line in input_file
                           if re.match(r'.*\bLost\b', line))
or, inside the same with block, you can use the wordier loop:
for line in input_file:
    if re.match(r'.*\bLost\b', line):
        output_file.write(line)
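To see what the word boundary buys you, here is a small sketch (written for Python 3; the sample lines and the has_word helper are made up for illustration). It uses re.search with \bLost\b, which for single lines should behave the same as re.match with a leading .* prefix. Note that an underscore counts as a word character, so Lost_shots is not treated as the word Lost.

```python
import re

# Compile once; \b marks a word boundary, so "Lost" must stand alone.
WORD = re.compile(r"\bLost\b")


def has_word(line):
    """Return True if the line contains "Lost" as a whole word."""
    return WORD.search(line) is not None


# Hypothetical sample lines, not taken from the asker's file.
samples = [
    "Lost shot 12\n",
    "blablaLostblabla\n",
    "shot was Lost.\n",
    "Lost_shots\n",
]

# Keep only the lines where "Lost" appears as its own word.
matches = [s for s in samples if has_word(s)]
```

Only the first and third sample lines survive the filter; the other two contain Lost glued to other word characters.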
As a side note, you should be using os.path.join to make paths; also, for working with temporary files in a cross-platform manner, see the functions in the tempfile module.
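A minimal sketch of that side note, assuming Python 3: the paths are built with os.path.join inside a throwaway directory from tempfile.mkdtemp, so nothing depends on /tmp existing. The file names, the sample content, and the filter_lines helper are hypothetical.

```python
import os
import tempfile


def filter_lines(src_path, dst_path, word="Lost"):
    """Copy every line containing `word` to dst_path; return how many were kept."""
    count = 0
    with open(src_path, "r") as src, open(dst_path, "w") as dst:
        for line in src:
            if word in line:
                dst.write(line)
                count += 1
    return count


# Build portable paths inside a temporary directory (names are made up).
workdir = tempfile.mkdtemp()
src_path = os.path.join(workdir, "LostShots.txt")
dst_path = os.path.join(workdir, "results.txt")

# Create a small input file to search.
with open(src_path, "w") as f:
    f.write("shot 1 ok\nshot 2 Lost\nshot 3 ok\nshot 4 Lost\n")

kept = filter_lines(src_path, dst_path)

with open(dst_path) as f:
    results = f.read()
```

Returning the count is just a convenience for checking the result; drop it if you only need the output file.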

Related

Meditation with texts in a text file in the case of threading

I am using this code, in threaded mode, to pull the first line from a text file before deleting it from the file:
with open(r'C:\datanames\names.txt','r') as fin:
    name = fin.readline()
with open(r'C:\datanames\names.txt', 'r') as fin:
    data = fin.read().splitlines(True)
with open(r'C:\datanames\names.txt', 'w') as fout:
    fout.writelines(data[1:])
but it often makes me lose data.
Is there a more efficient and practical way to do this in such a situation (threading)?
I see no reason to use threading for this. It's very straightforward.
To remove the first line from a file do this:
FILENAME = 'foo.txt'
with open(FILENAME, 'r+') as file:
    lines = file.readlines()
    file.seek(0)
    file.writelines(lines[1:])
    file.truncate()
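If several threads really do have to share the file, one possible approach (a sketch only, assuming Python 3 and that all threads live in the same process) is to wrap the whole read-modify-write in a threading.Lock so that no two threads can interleave. The lock name, the pop_first_line helper, and the sample names are hypothetical; a lock like this does not protect against other processes touching the file.

```python
import os
import tempfile
import threading

# Hypothetical module-level lock shared by every thread that touches the file.
_file_lock = threading.Lock()


def pop_first_line(path):
    """Remove and return the first line of the file, or None if it is empty."""
    with _file_lock:  # only one thread at a time may read and rewrite the file
        with open(path, "r+") as f:
            lines = f.readlines()
            if not lines:
                return None
            f.seek(0)
            f.writelines(lines[1:])
            f.truncate()  # drop the leftover bytes of the old, longer content
            return lines[0].rstrip("\n")


# Demo with a temporary file instead of C:\datanames\names.txt.
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as f:
    f.write("alice\nbob\ncarol\n")

taken = []


def worker():
    name = pop_first_line(path)
    if name is not None:
        taken.append(name)  # list.append is thread-safe in CPython


threads = [threading.Thread(target=worker) for _ in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()

with open(path) as f:
    remaining = f.read()
```

Five workers compete for three names; each name should be handed out exactly once and the file should end up empty.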

Insert new data for each text line in python

I have a text file that looks like this:
1,004,59
1,004,65
1,004,69
1,005,55
1,005,57
1,006,53
1,006,59
1,007,65
1,007,69
1,007,55
1,007,57
1,008,53
I want to create a new text file in which 'input' is appended to each line, something like this:
1,004,59,input
1,004,65,input
1,004,69,input
1,005,55,input
1,005,57,input
1,006,53,input
1,006,59,input
1,007,65,input
1,007,69,input
1,007,55,input
1,007,57,input
1,008,53,input
I have attempted something like this:
with open('data.txt', 'a') as f:
    lines = f.readlines()
    for i, line in enumerate(lines):
        line[i] = line[i].strip() + 'input'
    for line in lines:
        f.writelines(line)
Not able to get the right approach though.
What you want is to be able to read and write to the file in place (at the same time). Python comes with the fileinput module which is good for this purpose:
import fileinput
for line in fileinput.input('data.txt', inplace=True):
    line = line.rstrip()
    print line + ",input"
Discussion
The fileinput.input() function returns an iterable that reads your file line by line. Each line ends with a newline (either \n or \r\n, depending on the operating system).
The code then strips the newline off each line, adds the ",input" part, and prints the result. Note that because of fileinput magic, the print statement's output goes back into the file instead of to the console.
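For anyone on Python 3, where print is a function, the same idea might look like the sketch below. It runs against a temporary file with two of the sample rows rather than the real data.txt, and it uses fileinput.input as a context manager so that standard output is restored when the block ends.

```python
import fileinput
import os
import tempfile

# Create a small stand-in for data.txt (a temporary file, made up for the demo).
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as f:
    f.write("1,004,59\n1,004,65\n")

# With inplace=True, everything printed inside the loop is written to the file.
with fileinput.input(path, inplace=True) as lines:
    for line in lines:
        print(line.rstrip() + ",input")

# Read the file back to check the rewritten content.
with open(path) as f:
    result = f.read()
```

Each row in the file now ends with ,input and nothing from the loop is echoed to the console.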
There is a newline '\n' at the end of every line in your file, so you should handle it.
edit: oh I forgot about the rstrip() function!
tmp = []
with open("input.txt", 'r') as file:
    appendtext = ",input\n"
    for line in file:
        tmp.append(line.rstrip() + appendtext)
with open("input.txt", 'w') as file:
    file.writelines(tmp)
Added:
Answer by Hai_Vu is great if you use fileinput since you don't have to open the file twice as I did.
To do only the thing you're asking I would go for something like
newLines = list()
with open('data.txt', 'r') as f:
    lines = f.readlines()
    for line in lines:
        newLines.append(line.strip() + ',input\n')
with open('data2.txt', 'w') as f2:
    f2.writelines(newLines)
But there are definitely more elegant solutions

Text file opening in python

Could someone give me some guidance on how to get the contents of a text file into my Python code without opening the text file in another window?
Just point me in the right direction on how I should do it (No need for solutions)
with open(workfile, 'r') as f:
    for line in f:
        print line
If you don't use the context manager (the with statement) you will need to explicitly call f.close(), for example:
f = open('workfile', 'r')
line = f.readline()
print line
f.close()
file = open("your_file.txt", "r")
file.read()

Reading and writing to a file

I have an XML file that contains an illegal character, I am iterating through the file, removing the character from all of the lines and storing the lines in a list. I now want to write those same lines back into the file and overwrite what is already there.
I tried this:
file = open(filename, "r+")
#do stuff
This only appends the results to the end of the file; I would like to overwrite the existing contents.
And this:
file = open(filename, "r")
#read from the file
file.close()
file = open(filename, "w")
#write to file
file.close()
This gives me a Bad File Descriptor error.
How can I read and write to the same file?
Thanks
You could rewrite the file from the list of lines with the writelines function.
with open(filename, "r") as f:
    lines = f.readlines()
#edit lines here
with open(filename, "w") as f:
    f.writelines(lines)
The reason you're appending to the end of the file the whole time is that you need to seek to the beginning of the file to write your lines out.
with open(filename, "r+") as file:
    lines = file.readlines()
    lines = [line.replace(bad_character, '') for line in lines]
    file.seek(0)
    file.writelines(lines)
    # Get rid of any excess characters left at the end of the file: the new
    # content is shorter than the old one, as you've removed characters.
    file.truncate()
(Decided to just use the context manager syntax myself.)
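Here is a small self-contained version of the seek-and-truncate approach that you can run to convince yourself (Python 3; the temporary file, the "~" stand-in for the illegal character, and the strip_char_in_place helper are all made up for the demo).

```python
import os
import tempfile


def strip_char_in_place(path, bad_character):
    """Remove every occurrence of bad_character from the file, rewriting it in place."""
    with open(path, "r+") as f:
        lines = [line.replace(bad_character, "") for line in f.readlines()]
        f.seek(0)           # go back to the start, otherwise the writes are appended
        f.writelines(lines)
        f.truncate()        # cut off what remains of the old, longer content


# Demo file: "~" plays the role of the illegal character here.
fd, path = tempfile.mkstemp(suffix=".xml")
with os.fdopen(fd, "w") as f:
    f.write("<a>~one</a>\n<b>two~</b>\n")

strip_char_in_place(path, "~")

with open(path) as f:
    cleaned = f.read()
```

Without the truncate() call, the last two characters of the old content would still be sitting at the end of the file.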

Remove lines from a text file which do not contain a certain string with python

I am trying to form a quotes file of a specific user name in a log file. How do I remove every line that does not contain the specific user name in it? Or how do I write all the lines which contain this user name to a new file?
with open('input.txt', 'r') as rfp:
    with open('output.txt', 'w') as wfp:
        for line in rfp:
            if ilikethis(line):
                wfp.write(line)
with open(logfile) as f_in:
    lines = [l for l in f_in if username in l]
with open(outfile, 'w') as f_out:
    f_out.writelines(lines)
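Or, if you don't want to store all the lines in memory (the output file has to be written while the input file is still open, since the generator reads lazily):
with open(logfile) as f_in:
    lines = (l for l in f_in if username in l)
    with open(outfile, 'w') as f_out:
        f_out.writelines(lines)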
I sort of like the first one better but for a large file, it might drag.
Something along this line should suffice:
newfile = open(newfilename, 'w')
for line in open(filename, 'r'):
    if name in line:
        newfile.write(line)
newfile.close()
See : http://docs.python.org/tutorial/inputoutput.html#methods-of-file-objects
f.readlines() returns a list containing all the lines of data in the file.
An alternative approach to reading lines is to loop over the file object. This is memory efficient, fast, and leads to simpler code:
>>> for line in f:
...     print line
You can also check out the with keyword. Its advantage is that the file is properly closed after its suite finishes:
>>> with open(filename, 'r') as f:
...     read_data = f.read()
>>> f.closed
True
I know you asked for python, but if you're on unix this is a job for grep.
grep name file
If you're not on unix, well... the answer above does the trick :)
