I am trying to achieve:
The user inputs a word, and the program outputs how many lines contain that word and also shows up to the first ten such lines. If no line contains the word, the program must output Not found.
My code so far:
sentences = []
words_count = 0
maxlines = ""
with open("txt.txt") as file:
    for line in file:
        words = line.split()
        words_count += len(words)
        if len(words) > len(maxlines.split()):
            maxlines = line
        sentences.append(line)
word = input("Enter word: ")
count = 0
for line in sentences:
    if word in line:
        print(line)
        count += 1
print(count, "lines contain", word)
if count == 0:
    print("Not found.")
How would I print only the first 10 matching lines, regardless of how many lines match?
Thank you!
If you want to iterate 10 times (old style, not Pythonic at all):
index = 0
for line in file:
    if index >= 10:
        break
    # do stuff 10 times
    index += 1
Without using break, just put the stuff inside the condition. Notice that the loop will continue iterating over the rest of the file, so this is not really a smart solution.
index = 0
for line in file:
    if index < 10:
        # do stuff 10 times
        index += 1
However, neither of these is Pythonic. If you simply want to repeat something 10 times in Python, use range.
for _ in range(10):
    pass  # do stuff 10 times
_ means you don't care about the index and just want to repeat 10 times.
If you want to iterate over the file and keep the index (line number), you can use enumerate:
for lineNumber, line in enumerate(file):
    if lineNumber >= 10:
        break
    # do stuff 10 times
Finally, as @jrd1 suggested, you can read the whole file and then slice out only the part that you want. In your case:
sentences[:10]  # gives the first 10 sentences (lines)
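If the file is large, itertools.islice gives the same "first ten" behaviour without reading everything into a list first. This is only a minimal sketch: the helper name first_lines is made up for illustration, and io.StringIO stands in for the real open("txt.txt") so the snippet runs anywhere.

```python
import io
from itertools import islice

def first_lines(file_obj, limit=10):
    """Return at most `limit` lines from an open file, reading no further."""
    return [line.rstrip("\n") for line in islice(file_obj, limit)]

# io.StringIO is a stand-in for the real file (an assumption for the demo).
fake_file = io.StringIO("".join(f"line {n}\n" for n in range(25)))
print(first_lines(fake_file))
```

Note that this limits the lines read, not the lines that match a word; for the original task the limit has to be applied to the matches instead.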
Just change your code like this; it should help:
for line in sentences:
    if word in line:
        if count < 10: print(line) # <--------
        count += 1
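Putting the pieces together, the whole task could be sketched roughly as follows. The function names search_lines and report and the sample data are assumptions for illustration, not part of the original code, and `word in line` is a substring test, so "cat" would also match "category".

```python
def search_lines(lines, word, limit=10):
    """Count lines containing `word`; keep only the first `limit` of them."""
    count = 0
    shown = []
    for line in lines:
        if word in line:
            count += 1
            if len(shown) < limit:
                shown.append(line.rstrip("\n"))
    return count, shown

def report(lines, word):
    """Print the count and the first matches, or 'Not found.' if none."""
    count, shown = search_lines(lines, word)
    if count == 0:
        print("Not found.")
        return
    print(count, "lines contain", word)
    for line in shown:
        print(line)

# Hypothetical data: 12 matching lines and one that does not match.
sample = [f"row {n} has spam\n" for n in range(12)] + ["no match here\n"]
report(sample, "spam")
```

Counting keeps going after the tenth match, so the total stays correct while only ten lines are printed.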
Related
Basically, this is my task: extract numbers from a text file and then calculate their sum.
I wrote the code, but it doesn't work correctly with numbers of two or more digits or with negative numbers. What should I do?
f = open('file6.txt', 'r')
suma = 0
file = f.readlines()
for line in file:
    for i in line:
        if i.isdigit() == True:
            suma += int(i)
print("The sum is ", suma)
file6.txt:
1
10
Output:
The sum is 2
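In your code, you go through the file line by line and then, in the second loop, look at every character on its own, so each digit is added separately (10 becomes 1 + 0).
Also, the \n at the end of each line makes line.isdigit() return False, so strip it before checking. Note that this handles only non-negative whole numbers, since isdigit() is False for a string such as "-5".
So your updated code should look like this: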
f = open('file6.txt', 'r')
suma = 0
file = f.readlines()
for line in file:
    if line.strip().isdigit():
        suma += int(line)
print("The sum is ", suma)
Hope it helps!
Use re.split to split the input into words on anything that is not part of a number. Try to convert the words into numbers, silently skip if this fails.
import re
sum_nums_in_file = 0
with open('file6.txt') as f:
    for line in f:
        for word in re.split(r'[^-+\dEe.]+', line):
            try:
                num = float(word)
                sum_nums_in_file += num
            except ValueError:
                # not a number (for example an empty string), skip it
                pass
print(f"The sum is {sum_nums_in_file}")
This works for example on files such as this:
-1 2.0e0
+3.0
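If the file only ever holds whole numbers, a smaller variant is to pull signed integers out directly with re.findall. This is a sketch under that assumption; the function name sum_integers is made up here, and it would misread input such as "2.5" (as 2 and 5) or "3-2" (as 3 and -2).

```python
import re

def sum_integers(text):
    """Sum every signed integer found in `text`."""
    # -? allows an optional minus sign; \d+ grabs all consecutive digits,
    # so "10" is read as ten rather than as 1 and 0.
    return sum(int(match) for match in re.findall(r"-?\d+", text))

# Same content as file6.txt in the question, passed as a string.
print("The sum is", sum_integers("1\n10\n"))
```

With a real file you would pass the result of f.read() instead of the literal string.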
I'm trying to count the total number of words printed in a for loop, but sum() isn't working, and my attempt with list append also failed:
for line in open("jane_eyre.txt"):
    strip = line.rstrip()
    words = strip.split()
    for i in words:
        length = len(i)
        if length > 10 and "e" not in i:
            print(i)
            #Can't find a way to make it work here..
            sum(i)
The output of the words are:
possibility
drawingroom
disposition
drawingroom
introductory
accumulation
introductory
surrounding
continually
inflictions
I couldn't find a way to get "10" as the total.
Thanks.
You can use this:
c = 0 #number of words
for line in open("jane_eyre.txt"):
    strip = line.rstrip()
    words = strip.split()
    for i in words:
        length = len(i)
        if length > 10 and "e" not in i:
            print(i)
            c +=1 #increasing number of words
print(c)
sum() is not what you're looking for here. To understand sum() usage better, have a read of the documentation.
You can store a wordcount prior to the loop and increase the value every time you match your if statement.
words = ['hello', 'these', 'are', 'words', 'they', 'get', 'longer', 'introductory']
wordcount = 0
for word in words:
    if len(word) > 10 and "e" not in word:
        print(word)
        wordcount += 1
#introductory
print(wordcount)
#1
If you need to access the words later, you could append them to a list and count the objects that exist within the list to obtain the count as well.
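A minimal sketch of that list-based variant, using a short made-up word list in place of the book (the name matches is an assumption for illustration):

```python
# Hypothetical sample; in the real script the words come from jane_eyre.txt.
words = ["possibility", "hello", "drawingroom", "these", "introductory"]

matches = []
for word in words:
    if len(word) > 10 and "e" not in word:
        matches.append(word)  # keep the word so it can be used later

print(matches)
print(len(matches))  # the count is simply the length of the list
```

The list costs a little memory compared with a bare counter, but the matching words stay available after the loop.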
Try this:
count = 0
for line in open("jane_eyre.txt"):
    strip = line.rstrip()
    words = strip.split()
    for i in words:
        length = len(i)
        if length > 10 and "e" not in i:
            print(i)
            count = count + 1
print(count)
I have a pretty long file in which I have to look for a particular value. The problem is that this file has two lines that start the same way, and I need to print the second one.
The file is something like:
... random text
Total = 910 K. #Don't need it
... more random lines
Total = 1000 K #The one I need it
I'm using:
for i,line in enumerate(lines):
    if line.find('Total =') != -1:
        Total = line.split()[4]
        break
But this only gives me the first match.
How can I skip the first match and use the second one?
Probably not the best solution, but you could use a flag to check whether you've already found the first occurrence:
is_second_occurance = False
for i,line in enumerate(lines):
    if line.find('Total =') != -1:
        if is_second_occurance:
            total = line.split()[4]
            break
        else:
            is_second_occurance = True
A better solution is probably to break it into a generator function:
def get_total(lines):
    for line in lines:
        if line.startswith("Total = "):
            yield line.split()[4]
totals = get_total(lines)
next(totals)          # skip the first match
total = next(totals)  # take the second match
Calling next() twice on the same generator skips the first match and gives you the second occurrence.
Or maybe use a regex:
import re
string='''Total = 910 K. #Don't need it
... more random lines
Total = 1000 K #The one I need it'''
re.findall(r'(Total[\s=\w]*K)', string)[1] # finds everything related to Total; [1] is the second match
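The same idea generalizes to "give me the n-th match". Below is a rough sketch with itertools.islice; the name nth_total is made up, and it uses split()[2] because in the sample layout shown in the question the number is the third token (the question's own code uses index 4, which presumably fits the real file).

```python
from itertools import islice

def nth_total(lines, n=2):
    """Return the value from the n-th line starting with 'Total =', or None."""
    # Index 2 is an assumption based on the sample layout "Total = 1000 K".
    matches = (line.split()[2] for line in lines if line.startswith("Total ="))
    # islice skips the first n-1 matches; next() falls back to None if too few.
    return next(islice(matches, n - 1, None), None)

sample = [
    "... random text",
    "Total = 910 K.",
    "... more random lines",
    "Total = 1000 K",
]
print(nth_total(sample))
```

Because the generator is lazy, scanning stops as soon as the requested match has been found.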
Hello everyone, I have an issue with this problem: I need to reset the count after every line in the file. I put a comment in the code so you can see where I want to reset the count.
The program is supposed to cut each line after every lineLength characters.
def insert_newlines(string, afterEvery_char):
    lines = []
    for i in range(0, len(string), afterEvery_char):
        lines.append(string[i:i+afterEvery_char])
        string[:afterEvery_char] #i want to reset here to the beginning of every line to start count over
    print('\n'.join(lines))
def main():
    filename = input("Please enter the name of the file to be used: ")
    openFile = open(filename, 'r')
    file = openFile.read()
    lineLength = int(input("enter a number between 10 & 20: "))
    while (lineLength < 10) or (lineLength > 20) :
        print("Invalid input, please try again...")
        lineLength = int(input("enter a number between 10 & 20: "))
    print("\nYour file contains the following text: \n" + file + "\n\n") # Prints original File to screen
    print("Here is your output formatted to a max of", lineLength, "characters per line: ")
    insert_newlines(file, lineLength)
main()
Ex. If a file has 3 lines like this with each line having 20 chars
andhsytghfydhtbcndhg
andhsytghfydhtbcndhg
andhsytghfydhtbcndhg
After the lines are cut (here with a lineLength of 15), it should look like this:
andhsytghfydhtb
cndhg
andhsytghfydhtb
cndhg
andhsytghfydhtb
cndhg
I want to reset the count after every line in the file.
I'm not sure I understand your problem, but from your comments it appears you simply want to cut the input string (file) into pieces lineLength long. That is already done in your insert_newlines(), so there is no need for the line with the comment there.
However, if you want output lines (strings ending with a newline character) that are no more than lineLength characters long, you could simply read the file like this, instead of calling openFile.read() first:
lines = []
while True:
    line = openFile.readline(lineLength)
    if not line:
        break
    if line[-1] != '\n':
        line += '\n'
    lines.append(line)
print(''.join(lines))
or alternatively:
lines = []
while True:
    line = openFile.readline(lineLength)
    if not line:
        break
    lines.append(line.rstrip('\n'))
print('\n'.join(lines))
I don't understand the issue here, the code seems to work just fine:
def insert_newlines(string, afterEvery_char):
    lines = []
    # if len(string) is 100 and afterEvery_char is 10
    # then i will be equal to 0, 10, 20, ... 90
    # in lines we'll have [string[0:10], ..., string[90:100]] (ie the entire string)
    for i in range(0, len(string), afterEvery_char):
        lines.append(string[i:i+afterEvery_char])
        # resetting i here won't have any effect whatsoever
    print('\n'.join(lines))
>>> insert_newlines('Beautiful is better than ugly.\nExplicit is better than implicit.\nSimple is better than complex.\n..', 10)
Beautiful
is better
than ugly.
Explicit
is better
than impli
cit.
Simpl
e is bette
r than com
plex.
..
Isn't that what you want?
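If the goal is for the count to start over at each line of the file, one option is to split the text into lines first and chunk every line on its own. This is only a sketch of that reading of the question; the function name wrap_lines is made up for illustration.

```python
def wrap_lines(text, width):
    """Cut every line of `text` into pieces of at most `width` characters."""
    pieces = []
    for line in text.splitlines():
        if not line:
            pieces.append("")  # keep blank lines as they are
            continue
        # The range restarts at 0 for every line, which "resets the count".
        for start in range(0, len(line), width):
            pieces.append(line[start:start + width])
    return "\n".join(pieces)

# Three 20-character lines, as in the example from the question.
sample = "andhsytghfydhtbcndhg\n" * 3
print(wrap_lines(sample, 15))
```

Unlike chunking the whole file as one string, a newline in the input can never end up in the middle of a piece here.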
I have a problem in a Python script: I need the index used with split() to increase automatically on every iteration of the loop. I do this:
tag = "\'"
while loop<=302:
    for line in f1.readlines():
        if tag in line:
            word = line.split(tag)[num] #num is the index I need to increase
    text = "Word: "+word+"."
    f.write(text)
    num = num + 1
    loop = loop + 1
But the num variable used as the index doesn't change; it simply stays the same. num indicates which word I need to take, which is why "num = num + 1" should increase it.
What is the problem in the loop?
Thanks!
Your question is confusing, but I think you want to move num = num + 1 into the for loop and the if statement.
tag = "\'"
while loop<=302:
    for line in f1.readlines():
        if tag in line:
            word = line.split(tag)[num] #num is the index I need to increase
            num = num + 1
    text = "Word: "+word+"."
    f.write(text)
    loop = loop + 1
Based on Benyi's comment on the question: do you just want this for the individual sentences? You might not need an index at all.
>>> mystring = 'hello i am a string'
>>> for word in mystring.split():
...     print('Word:', word)
...
Word: hello
Word: i
Word: am
Word: a
Word: string
There seem to be a lot of things wrong with this.
First
while loop <= 302:
    for line in f1.readlines():
f1.readlines() is going to be [] for every iteration past the first.
Second
for line in f1.readlines():
    word = line.split(tag)[num]
    ...
text = "Word: "+word+"."
Even if you made the for loop work, text will always use the word from the last iteration. Maybe this is the desired behavior, but it seems strange.
Third
while loop<=302:
    ...
    loop = loop + 1
Seems like it would be better written as
for _ in range(303):
since loop isn't used at all inside that scope. This assumes loop starts at 0, in which case loop <= 302 runs 303 times; if it doesn't, adjust the count to however many iterations you wanted.
Lastly
num = num + 1
This is outside your inner loop, so num will always be the same for the first iteration, and it won't matter later because of the empty f1.readlines(), as stated before.
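The first point is easy to check for yourself. In this small sketch io.StringIO stands in for the real f1 (an assumption so that it runs without a file on disk); a file opened with open() behaves the same way.

```python
import io

# Stand-in for f1; the content is made up for the demo.
f1 = io.StringIO("first 'a' line\nsecond 'b' line\n")

first_pass = f1.readlines()
second_pass = f1.readlines()  # the read position is already at the end
print(len(first_pass), len(second_pass))

f1.seek(0)                    # rewind before reading again
third_pass = f1.readlines()
print(len(third_pass))
```

In practice it is usually simpler to call readlines() once before the outer loop and reuse the resulting list.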
I have a different approach to your problem, based on what you mentioned in the comment. Suppose input.txt contains the following line:
this is a an input file.
Then the following code will give you the desired output:
with open(r'C:\temp\input.txt', 'r') as fh:
    lines = fh.read()
with open(r'C:\temp\outputfile.txt', 'w') as fh1:
    for words in lines.split():
        fh1.write("Words:" + words + "\n")