I'm pretty sure I'm overthinking this and there's a simple solution for it, but I just can't seem to put it all together.
I'm looking for a kind of search method. I'd like a Python script to search a text file for a keyword and count how many lines it appears on. If the keyword comes up multiple times on a single line, I'd still like to count that line only once.
Long story short: if a keyword comes up on a single line, I want to count that line and add it to what will be a math equation.
Any help is greatly appreciated! Thanks in advance.
You can define the following function.
def lcount(keyword, fname):
    with open(fname, 'r') as fin:
        return sum([1 for line in fin if keyword in line])
Now if you want to know the number of lines containing "int" in "foo.cpp", you do:
print(lcount('int', 'foo.cpp'))
An alternative that you can do on the command line (if you are on an appropriate platform) is:
grep int foo.cpp | wc -l
A non-Python Unix solution is fairly immediate:
"search a text file for a keyword" is a grep
"count how many lines" is a wc
Do you have difficulty implementing the essence of either of these in Python?
Assuming f is the file object,
lines = f.readlines()
print(len([line for line in lines if keyword in line]))
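The requirement that a line is counted only once, however often the keyword repeats on it, follows from testing membership per line rather than counting occurrences. Here is a minimal sketch of that idea; the function name `count_matching_lines` and the sample text are made up for illustration, and `io.StringIO` stands in for an open file so the snippet runs without any file on disk.

```python
import io

def count_matching_lines(lines, keyword):
    """Count the lines that contain keyword at least once."""
    # `keyword in line` is True or False per line, so repeats on one line count once.
    return sum(1 for line in lines if keyword in line)

# Hypothetical sample: "spam" appears three times on the first line.
sample = io.StringIO("spam spam spam\neggs\nspam and eggs\n")
print(count_matching_lines(sample, "spam"))  # prints 2
```

Note that this is a substring test, so a keyword such as "int" would also match a line containing "print".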
Perhaps you could try this:
def kwdCount(textContent, keyword):
    lines = textContent.split("\n")
    count = len([1 for line in lines if line.find(keyword) != -1])
    return count
>>> yourTextFile = "hello world\n some words here\n goodbye world"
>>> kwdCount(yourTextFile, "world")
2
I am still learning Python and have a question about the function readlines(). The following is a part of my script:
f = open("demofile.txt", "r")
text = "".join(f.readlines())
print(text)
demofile.txt contains:
This is the first line
This is the second line
This is the third line
Now I want to add a single word to this so I get:
This is the first line
This is the second line
This is the third line
Example
I thought of an easy way of doing it:
f = open("demofile.txt", "r")
text = "".join(f.readlines())."Example"
print(text)
But that doesn't work (of course). I googled and looked around here, but I didn't really have good keywords to search for this issue. Hopefully someone can point me in the right direction.
.readlines() returns a list, so you can append() to it:
with open("demofile.txt") as txt:
    lines = txt.readlines()
    lines.append("Example")
    text = "".join(lines)
    print(text)
Or, since the file object txt is an iterator, you can unpack it into a new list together with the word you want to add:
with open("demofile.txt") as txt:
    text = "".join([*txt, "Example"])
    print(text)
Firstly, the open function in python opens a file in read mode by default. Thus, you do not need to specify the mode r when opening the file. Secondly, you should always close a file after you are done with it. A with statement in python handles this for you. Moreover, instead of using . to add Example onto the end of the string, you should use the concatenation operator in python to add a newline character, \n, and the string, Example.
with open("demofile.txt") as f:
    text = "".join(f.readlines()) + "\nExample"
    print(text)
This should help you. When dealing with files, it is always recommended to use with open('filename','r') as f instead of f=open('filename','r'). The idea of using a context manager when opening a file is that the file will be closed in any case, whether everything is OK or an exception is raised. You also don't need to close the file explicitly, i.e. call f.close().
end_text = 'Example'
with open('test.txt', 'r') as f:
    text = ''.join(f.readlines()) + '\n' + end_text
    print(text)
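One detail the answers above handle differently is the newline before the appended word. Whether "\n" is needed depends on whether the file's last line already ends with one. The following is a small sketch under that assumption; `append_line` is a hypothetical helper name, not something from the answers.

```python
def append_line(text, word):
    """Return text with word added as a new final line."""
    # Add a separator only if the existing text does not already end with one.
    if text and not text.endswith("\n"):
        text += "\n"
    return text + word

print(append_line("This is the first line\nThis is the second line", "Example"))
```

With this helper, a file that ends in a newline and one that does not both produce the same result, without a blank line before the appended word.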
The exact question to this problem is:
Create a file with 20 lines of text and name it “lines.txt”. Write a program to read this file, “lines.txt”, and write the text to a new file, “numbered_lines.txt”, that will also have line numbers at the beginning of each line.
Example:
Input file: “lines.txt”
Line one
Line two
Expected output file:
1 Line one
2 Line two
I am stuck, and this is what I have so far. I am a true beginner to Python and my instructor does not make things very clear. Critique and help much appreciated.
file_object=open("lines.txt",'r')
for ln in file_object:
    print(ln)
count=1
file_input=open("numbered_lines.txt",'w')
for Line in file_object:
    print(count,' Line',(str))
    count=+1
file_object.close
file_input.close
All I get for output is the .txt file I created stating lines 1-20. I am very stuck and honestly have very little idea about what I am doing. Thank you
You have all the right parts, and you're almost there:
When you do
for ln in file_object:
    print(ln)
you've exhausted the contents of that file, and you won't be able to read them again, like you try to do later on.
Also, print does not write to a file by default; you want file_input.write(...)
This should fix all of that:
infile = open("lines.txt", 'r')
outfile = open("numbered_lines.txt", 'w')
line_number = 1
for line in infile:
    outfile.write(str(line_number) + " " + line)
    line_number += 1  # advance the counter, otherwise every line is numbered 1
infile.close()
outfile.close()
However, here is a more pythonic way to do it:
with open("lines.txt") as infile, open("numbered_lines.txt", 'w') as outfile:
    for i, line in enumerate(infile, 1):
        outfile.write("{} {}".format(i, line))
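To try the enumerate approach without creating any files, the same loop can be run against in-memory text. This is only a sketch: `number_lines` is a hypothetical wrapper around the loop above, and `io.StringIO` objects stand in for the two open files.

```python
import io

def number_lines(infile, outfile):
    """Copy infile to outfile, prefixing each line with its 1-based number."""
    for i, line in enumerate(infile, 1):
        # Each line keeps its own trailing newline, so none is added here.
        outfile.write("{} {}".format(i, line))

source = io.StringIO("Line one\nLine two\n")
target = io.StringIO()
number_lines(source, target)
print(target.getvalue(), end="")
# prints:
# 1 Line one
# 2 Line two
```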
Good first try, and with that, I can go through your code and explain what you did right (or wrong)
file_object=open("lines.txt",'r')
for ln in file_object:
    print(ln)
This is fine, though generally you want to put a space before and after assignments (you are assigning the result of open to file_object) and add a space after a , when separating arguments, so you might want to write that like so:
file_object = open("lines.txt", 'r')
for ln in file_object:
    print(ln)
However, at this point the internal position of file_object has reached the end of the file, so if you wish to reuse the same object, you need to seek back to the beginning position, which is 0. As your assignment only says to write to the file (and not to the screen), the above loop should be omitted from the program (I get what you want to do: you want to see the contents of the file immediately, though sometimes instructors are pretty strict about what they accept). Moving on:
count=1
file_input=open("numbered_lines.txt",'w')
for Line in file_object:
Looks pretty normal so far, again, minor formatting issues. In Python, typically we name all variables lower-case, as names with Capitalization are generally reserved for class names (if you wish to, you may read about them). Now we enter into the loop you got
print(count,' Line',(str))
This prints not quite what you want. As ' Line' is enclosed in quotes, it is treated as a string literal, so it is treated literally as text and not as code. Given that you had assigned Line, you want to take out the quotes. The (str) at the end simply prints out the str type object, which is definitely not what you want. Also, you forgot to specify the file you want to print to. By default it will print to the screen, but you want to print to the numbered_lines.txt file, which you had opened and assigned to file_input. We will correct this later.
count=+1
If you format this differently, you are assigning +1 to count. I am guessing you wanted to use the += operator to increment it. Remember this on your quiz/tests.
Finally:
file_object.close
file_input.close
They are meant to be called as functions: you invoke them by adding parentheses at the end, with any arguments inside. As close takes no arguments, there will be nothing inside the parentheses. Putting everything together, the complete corrected code for your program should look like this:
file_object = open("lines.txt", 'r')
count = 1
file_input = open("numbered_lines.txt", 'w')
for line in file_object:
    print(count, line, file=file_input)
    count += 1
file_object.close()
file_input.close()
Run the program. You will notice that there is an extra empty line between every line of text. This is because, by default, the print function adds a newline character at the end; the line you got from the file already includes a newline character at the end (that's what makes them lines, right?), so we don't have to add our own here. You can change print's end argument to an empty string to suppress the extra newline. That line will look like this:
print(count, line, file=file_input, end='')
Naturally, other Python programmers will tell you that there are Pythonic ways, but you are just starting out, don't worry too much about them (although you can definitely pick up on this later and I highly encourage you to!)
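The earlier point about the file object reaching the end of the file, and seeking back to position 0, can be seen in a short sketch. It assumes nothing beyond the standard library; `io.StringIO` stands in for the opened lines.txt so the snippet runs on its own.

```python
import io

f = io.StringIO("Line one\nLine two\n")  # stands in for an open text file

first_pass = [line for line in f]   # reads both lines
second_pass = [line for line in f]  # empty: the position is already at the end
f.seek(0)                           # move back to the beginning
third_pass = [line for line in f]   # reads both lines again

print(len(first_pass), len(second_pass), len(third_pass))  # prints 2 0 2
```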
The right way to open a file is using a with statement:
with open("lines.txt",'r') as file_object:
    ... # do something
That way, the context manager introduced by with will close your file at the end of "something" or in case of an exception.
Of course, you can close the file yourself if you are not familiar with that. Note that close is a method: to call it you need parentheses:
file_object.close()
See the chapter 7.2. Reading and Writing Files, in the official documentation.
In the first loop you're printing the contents of the input file. This means that the file contents have already been consumed when you get to the second loop. (Plus the assignment didn't ask you to print the file contents.)
In the second loop you're using print() instead of writing to a file. Try file_input.write(str(count) + " " + Line) (And file_input seems like a bad name for a file that you will be writing to.)
count=+1 sets count to +1, i.e. positive one. I think you meant count += 1 instead.
At the end of the program you're calling .close instead of .close(). The parentheses are important!
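Two of the points above, count=+1 versus count += 1 and .close versus .close(), are easy to check interactively. The sketch below does so with an in-memory `io.StringIO` object in place of a real file; the variable names are only for illustration.

```python
import io

count = 1
for _ in range(3):
    count = +1      # assigns positive one on every iteration
wrong = count

count = 1
for _ in range(3):
    count += 1      # increments by one on every iteration
right = count

f = io.StringIO("text")
f.close             # only references the method, nothing is called
still_open = not f.closed
f.close()           # actually closes the object
now_closed = f.closed

print(wrong, right, still_open, now_closed)  # prints 1 4 True True
```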
So I'm trying to find a way to read a txt file and find a specific word. I have been opening the file with
myfile=open('daily.txt','r')
r=myfile.readlines()
That returns a list with a string for each line in the file. I want to find a word in one of the strings inside the list.
edit:
I'm sorry, I meant whether there is a way to find where the word is in the txt file, like x=myfile[12] x=x[2:6]
def findLines():
    myWord = 'someWordIWantToSearchFor'
    answer = []
    with open('daily.txt') as myfile:
        lines = myfile.readlines()
    for line in lines:
        if myWord in line:
            answer.append(line)
    return answer
with open('daily.txt') as myfile:
    for line in myfile:
        if "needle" in line:
            print("found it:", line)
With the above, you don't need to allocate memory for the entire file at once, only one line at a time. This will be much more efficient if your file is large. It also closes the file automatically at the end of the with.
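Since the edited question asks where the word is (which line, and which slice of that line), one possible approach is to combine enumerate with str.find. This is a sketch only; `find_positions` is a hypothetical name, and `io.StringIO` replaces daily.txt so the example is self-contained.

```python
import io

def find_positions(lines, word):
    """Return (line_index, column) for the first occurrence of word on each matching line."""
    positions = []
    for i, line in enumerate(lines):
        col = line.find(word)  # -1 means the word is not on this line
        if col != -1:
            positions.append((i, col))
    return positions

sample = io.StringIO("no match here\na needle in hay\nneedle first\n")
print(find_positions(sample, "needle"))  # prints [(1, 2), (2, 0)]
```

Given a pair (i, col), the word itself is lines[i][col:col + len(word)], which matches the x=myfile[12], x=x[2:6] style of indexing in the question once the lines are held in a list.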
I'm not sure if the suggested answers solve the problem or not, because I'm not sure what the original proposer means. If he really means "words," not "substrings" then the solutions don't work, because, for example,
'cat' in line
evaluates to True if line contains the word 'catastrophe.' I think you may want to amend these answers along the lines of
if word in line.split(): ...
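The difference between substring and whole-word matching can be checked directly. The sketch below also shows one limitation of line.split(): punctuation stays attached to the word. The regular-expression variant with \b word boundaries is an alternative added here for illustration, not something proposed in the answers, and `contains_word` is a hypothetical name.

```python
import re

def contains_word(line, word):
    """True if word occurs in line as a whole word (delimited by word boundaries)."""
    return re.search(r"\b" + re.escape(word) + r"\b", line) is not None

print("cat" in "what a catastrophe")            # True: substring match
print("cat" in "what a catastrophe".split())    # False: no token equals "cat"
print("cat" in "the cat.".split())              # False: the token is "cat."
print(contains_word("the cat.", "cat"))         # True
print(contains_word("what a catastrophe", "cat"))  # False
```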
I'm new to Python and am working on a program that will count the instances of words in a simple text file. The name of the text file will be read from the command line, so I have included code in my program for reading command-line arguments. The code is below:
import sys
count={}
with open(sys.argv[1],'r') as f:
    for line in f:
        for word in line.split():
            if word not in count:
                count[word] = 1
            else:
                count[word] += 1
    print(word,count[word])
    file.close()
count is a dictionary to store the words and the number of times they occur. I want to be able to print out each word and the number of times it occurs, starting from most occurrences to least occurrences.
I'd like to know if I'm on the right track, and if I'm using sys properly. Thank you!!
What you did looks fine to me. One could also use collections.Counter (assuming you are on Python 2.7 or newer) to get the count of each word. My solution would look like this; some improvement is probably possible.
import sys
from collections import Counter
lines = open(sys.argv[1], 'r').readlines()
c = Counter()
for line in lines:
    # update() takes an iterable of items, so pass the list of words;
    # passing a single string would count its characters instead
    c.update(line.strip().split())
for ind in c:
    print(ind, c[ind])
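Because the question asks for output ordered from most to least occurrences, it may help to know that Counter has a most_common() method that returns (item, count) pairs in descending order of count. A minimal sketch with made-up sample text:

```python
from collections import Counter

text = "a b a c b a"            # hypothetical sample input
c = Counter(text.split())       # counts whole words, not characters

for word, n in c.most_common():
    print(word, n)
# prints:
# a 3
# b 2
# c 1
```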
Your final print doesn't have a loop, so it will just print the count for the last word you read, which still remains as the value of word.
Also, with a with context manager, you don't need to close() the file handle.
Finally, as pointed out in a comment, you'll want to remove the final newline from each line before you split.
For a simple program like this, it's probably not worth the trouble, but you might want to look at defaultdict from the collections module to avoid the special case for initializing a new key in the dictionary.
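As a sketch of what that suggestion looks like: defaultdict(int) creates missing keys with the value 0, so the if/else around the first occurrence of a word disappears. The sample lines here are made up for illustration.

```python
from collections import defaultdict

count = defaultdict(int)  # missing keys start at int(), which is 0

for line in ["the quick fox", "the lazy dog"]:
    for word in line.split():
        count[word] += 1  # no need to check whether word is already a key

print(count["the"], count["fox"])  # prints 2 1
```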
I just noticed a typo: you open the file as f but you close it as file. As tripleee said, you shouldn't close files that you open in a with statement. Also, it's bad practice to use the names of builtin functions, like file or list, for your own identifiers. Sometimes it works, but sometimes it causes nasty bugs. And it's confusing for people who read your code; a syntax highlighting editor can help avoid this little problem.
To print the data in your count dict in descending order of count you can do something like this:
items = list(count.items())
items.sort(key=lambda kv: kv[1], reverse=True)
print('\n'.join('%s: %d' % (k, v) for k, v in items))
See the Python Library Reference for more details on the list.sort() method and other handy dict methods.
I just did this using the re library. This computes the average number of words per line in a text file, but you will have to adapt it to find the number of words on each line.
import re

# this program gets the average number of words per line
def main():
    try:
        # get name of file
        filename = input('Enter a filename:')
        # open the file
        infile = open(filename, 'r')
        # read file contents
        contents = infile.read()
        # count newline characters and words
        line = len(re.findall(r'\n', contents))
        count = len(re.findall(r'\w+', contents))
        average = count // line
        # display file contents
        print(contents)
        print('there is an average of', average, 'words per line')
        # close the file
        infile.close()
    except IOError:
        print('An error occurred when trying to read ')
        print('the file', filename)

# call main
main()
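The program above divides by the number of newline characters, so it would raise ZeroDivisionError for text without any newline. A possible variant that avoids this, shown as a sketch rather than a drop-in replacement: `average_words_per_line` is a hypothetical helper that works on a string, using str.splitlines and str.split instead of regular expressions, and returns a float rather than an integer quotient.

```python
def average_words_per_line(text):
    """Return the mean number of whitespace-separated words per line, or 0.0 for empty text."""
    lines = text.splitlines()
    if not lines:
        return 0.0  # avoid dividing by zero when there are no lines
    return sum(len(line.split()) for line in lines) / len(lines)

print(average_words_per_line("one two\nthree\n"))  # prints 1.5
```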
I have a text file that I need to search; it may contain uppercase or lowercase letters. How can I do this using Python?
Maybe you should spend more time writing the question if you expect us to invest time in answering it. Nevertheless, from what I understand, you are looking for something like this:
import sys
# sys.argv[1] is the first command-line argument (sys.argv[0] is the script itself)
with open(sys.argv[1], "r") as f:
    for row in f:
        for ch in row:
            if ch.isupper():
                print(ch, "uppercase")
            elif ch.islower():
                print(ch, "lowercase")
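If the goal is instead to find a keyword regardless of its case (one possible reading of the question, and an assumption here), a common approach is to normalize both sides before comparing. The sketch below uses str.casefold; `lines_containing` is a hypothetical name, and a list of strings stands in for the lines of the file.

```python
def lines_containing(lines, keyword):
    """Return the lines that contain keyword, ignoring upper/lower case."""
    needle = keyword.casefold()
    return [line for line in lines if needle in line.casefold()]

sample = ["Hello World\n", "nothing here\n", "hello again\n"]
print(lines_containing(sample, "HELLO"))  # prints ['Hello World\n', 'hello again\n']
```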