Python programming error re: reading from files

I'm taking an online class and we were assigned the following task:
"Write a program that prompts for a file name, then opens that file and reads through the file, looking for lines of the form:
X-DSPAM-Confidence: 0.8475
Count these lines and extract the floating point values from each of the lines and compute the average of those values and produce an output as shown below.
You can download the sample data at http://www.pythonlearn.com/code/mbox-short.txt. When you are testing, enter mbox-short.txt as the file name."
The desired output is: "Average spam confidence: 0.750718518519"
Here is the code I've written:
fname = raw_input("Enter file name: ")
fh = open(fname)
inp = fh.read()
for line in inp:
if not line.strip().startswith("X-DSPAM-Confidence: 0.8475") : continue
pos = line.find(':')
num = float(line[pos+1:])
total = float(num)
count = float(total + 1)
print 'Average spam confidence: ', float( total / count )
The output I get is: "Average spam confidence: nan"
What am I missing?

values = []
#fname = raw_input("Enter file name: ")
fname = "mbox-short.txt"
with open(fname, 'r') as fh:
    for line in fh.read().split('\n'): # creating a list of lines
        if line.startswith('X-DSPAM-Confidence:'):
            values.append(line.replace('X-DSPAM-Confidence: ', '')) # I don't know what's after the float value
values = [float(i) for i in values] # need to convert the strings to floats
print 'Average spam confidence: %f' % float( sum(values) / len(values))
I just tested this against the sample data, and it works fine.

# Try the code below; it works.
fname = raw_input("Enter file name: ")
count=0
value = 0
sum=0
fh = open(fname)
for line in fh:
    if not line.startswith("X-DSPAM-Confidence:") : continue
    pos = line.find(':')
    num = float(line[pos+1:])
    sum=sum+num
    count = count+1
print "Average spam confidence:", sum/count

My guess from the question is that 0.8475 is just an example value, and you should be finding all the X-DSPAM-Confidence: lines and reading the number from each.
Also, the indentation of the code you posted puts all the calculations outside the for loop. I'm hoping that is just a formatting error from the upload; otherwise that would also be a problem.
As a simplification, you can also skip the
inp = fh.read()
line and just do
for line in fh:
Another thing to look at is that total will only ever hold the last number you read.
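To see why looping over the result of fh.read() never finds a matching line, here is a minimal Python 3 sketch. It uses io.StringIO as a stand-in for the opened file, and the two sample lines are made up for illustration; it simply compares what each kind of loop yields.

```python
import io

# Made-up two-line sample standing in for the real mbox file.
SAMPLE = "X-DSPAM-Confidence: 0.8475\nX-DSPAM-Confidence: 0.6178\n"

# Iterating over the string returned by read() yields single characters.
chars = [piece for piece in io.StringIO(SAMPLE).read()]

# Iterating over the file object itself yields whole lines.
lines = [piece for piece in io.StringIO(SAMPLE)]

print(chars[:3])  # the first three characters
print(lines)      # two complete lines
```

Because each item from the first loop is a single character, a startswith check against the field name can never succeed there.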

# Use the file name mbox-short.txt as the file name
fname = raw_input("Enter file name: ")
fh = open(fname)
count = 0
total = 0
for line in fh:
    if not line.startswith("X-DSPAM-Confidence:") : continue
    count = count + 1
    # print count
    num = float(line[20:])
    total +=num
    # print total
average = total/count
print "Average spam confidence:", average

The way you're checking for the correct field is too specific. You need to look for the field name without a value (see the code below). Also, your counting and totaling need to happen within the loop. Here is a simpler solution that makes use of Python's built-in functions. Using a list like this takes a little more space but makes the code easier to read, in my opinion.
How about this? :D
with open(raw_input("Enter file name: ")) as f:
    values = [float(line.split(":")[1]) for line in f.readlines() if line.strip().startswith("X-DSPAM-Confidence")]
print 'Average spam confidence: %f' % (sum(values)/len(values))
My output:
Average spam confidence: 0.750719
If you need more precision on that float, see: Convert floating point number to certain precision, then copy to String
Edit: Since you're new to Python, that may be a little too pythonic :P Here is the same code expanded a little:
fname = raw_input("Enter file name: ")
values = []
with open(fname) as f:
    for line in f.readlines():
        if line.strip().startswith("X-DSPAM-Confidence"):
            values.append(float(line.split(":")[1]))
print 'Average spam confidence: %f' % (sum(values)/len(values))
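On the precision point: %f prints six digits after the decimal point by default, which is why the output above is rounded. The small Python 3 sketch below uses the average quoted in the assignment text as a fixed value (rather than computing it from the file) to show how a precision specifier changes the printed result.

```python
average = 0.750718518519  # value quoted in the assignment text

default_format = '%f' % average    # six digits after the decimal point
longer_format = '%.12f' % average  # twelve digits after the decimal point

print('Average spam confidence:', default_format)
print('Average spam confidence:', longer_format)
```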

fname = raw_input("Enter file name: ")
fh = open(fname)
x_count = 0
total_count = 0
for line in fh:
    if not line.startswith("X-DSPAM-Confidence:") : continue
    line = line.strip()
    x_count = x_count + 1
    num = float(line[20:])  # the value starts at index 20; [21:] would drop its first digit
    total_count = num + total_count
aver = total_count / x_count
print "average spam confidence:", aver

user_data = raw_input("Enter the file name: ")
lines_list = [line.strip("\n") for line in open(user_data, 'r')]
def find_spam_confidence(data):
    confidence_sum = 0
    confidence_count = 0
    for line in data:  # use the argument rather than the global list
        if line.find("X-DSPAM-Confidence") == -1:
            pass
        else:
            confidence_index = line.find(" ") + 1
            confidence = float(line[confidence_index:])
            confidence_sum += confidence
            confidence_count += 1
    print "Average spam confidence:", str(confidence_sum / confidence_count)
find_spam_confidence(lines_list)

fname = raw_input("Enter file name: ")
fh = open(fname)
c = 0
t = 0
for line in fh:
    if line.startswith("X-DSPAM-Confidence:") :
        c = c + 1
        p = line.find(':')
        n = float(line[p+1:])
        t = t + n
print "Average spam confidence:", t/c

fname = input("Enter file name: ")
fh = open(fname)
count = 0
add = 0
for line in fh:
    if line.startswith("X-DSPAM-Confidence:"):
        count = count+1
        pos = float(line[20:])
        add = add+pos
print("Average spam confidence:", add/count)  # divide the running total (add), not the built-in sum

fname = input('Enter the file name : ') # file name is mbox-short.txt
try:
    fopen = open(fname,'r') # open the file to read through it
except OSError:
    print('Wrong file name') # if the user enters a wrong file name, display 'Wrong file name'
    quit()
count = 0 # number of 'X-DSPAM-Confidence:' lines
total = 0 # sum of the floating point values
for line in fopen: # go through the file line by line
    if line.startswith('X-DSPAM-Confidence:'): # check whether the line starts with 'X-DSPAM-Confidence:'
        count = count + 1 # count the lines that start with 'X-DSPAM-Confidence:'
        strip = line.strip() # remove leading and trailing whitespace
        nline = strip.find(':') # find the position of ':' in the line
        wstring = strip[nline+2:] # extract the decimal value as a string
        fstring = float(wstring) # convert the string to a float
        total = total + fstring # add the value to the running sum
print('Average spam confidence:',total/count) # print the average value

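total = float(num)
Here you forgot to add up the num values; this line replaces total with the latest number on every pass.
It should have been (with total = 0 set before the loop)
total = total+num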
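The difference between overwriting and accumulating can be seen in a few lines. This is only an illustrative Python 3 sketch; the three sample values are made up, and the variable names total_overwritten and total_summed are hypothetical.

```python
nums = [0.8475, 0.6178, 0.6961]  # made-up sample values

# Overwriting: total only ever holds the most recent value.
total_overwritten = 0.0
for num in nums:
    total_overwritten = float(num)

# Accumulating: total grows by each value in turn.
total_summed = 0.0
count = 0
for num in nums:
    total_summed = total_summed + num
    count = count + 1

print(total_overwritten)     # just the last value
print(total_summed / count)  # the average of all three
```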

fname = input("Enter file name: ")
fh = open(fname)
count=0
avg=0
cal=0
for line in fh:
    if not line.startswith("X-DSPAM-Confidence:") :
        continue
    else:
        count=count+1
        pos = line.find(':')
        num=float(line[pos+1:])
        cal=float(cal+num)
        #print cal,count
avg=float(cal/count)
print ("Average spam confidence:",avg)

It works just fine!
# Use the file name mbox-short.txt as the file name
fname = raw_input("Enter file name: ")
if len(fname) == 0:
    fname = 'mbox-short.txt'
fh = open(fname)
count = 0
tot = 0
ans = 0
for line in fh:
    if not line.startswith("X-DSPAM-Confidence:") : continue
    count = count + 1
    num = float(line[20:])  # the value starts at index 20; [21:] would drop its first digit
    tot = num + tot
ans = tot / count
print "Average spam confidence:", ans

# Use the file name mbox-short.txt as the file name
fname = raw_input("Enter file name: ")
fh = open(fname,'r')
count=0
avg=0.0
cal=0.00
for line in fh:
    if not line.startswith("X-DSPAM-Confidence:") :
        continue
    else:
        count=count+1
        pos = line.find(':')
        num=float(line[pos+1:])
        cal=cal+num
        #print cal,count
avg=float(cal/count)
print "Average spam confidence:",avg

fname = raw_input("Enter file name: ")
fh = open(fname)
total = 0
count = 0
for line in fh:  # iterate over the file line by line, not over fh.read()
    if not line.startswith("X-DSPAM-Confidence:"):  # match the field name only, not one fixed value
        continue
    pos = line.find(':')
    num = float(line[pos+1:])
    total += num
    count += 1
print 'Average spam confidence: ', float( total / count )

Related

My code cannot find my file, and I am not sure what is wrong with it

def get_file():
    file_name = input("Enter the name of the file: ")
    try:
        count = 0
        total = 0.0
        average = 0.0
        maximum = 0
        minimum = 0
        range1 = 0
        with open(file_name) as file:
            number = int(line)
            count = count + 1
            total = total+ num
            maximum = max(number)
            minimum = min(number)
        average = total/count
        range = maximum = minimum
        print('The name of the file: ', file_name)
        print('The sum of the numbers: ', total)
        print('The count of how many numbers are in the file: ', count)
        print('The average of the numbers: ', average)
        print('The maximum value: ', maximum)
        print('The minimum value: ', minimum)
        print('The range of the values (maximum - minimum): ', range)
    except:
        print("The file is not found.")
def main():
    get_file()
main()
That is my code. I keep getting the error that the file is not found. I have made sure that the text files I am passing to this code are in the same folder and that I am spelling everything correctly. What is wrong?

Well, you are never doing anything with the lines of the file, and you need to make sure you are passing a valid file path to the function. For example:
def get_file(file_name):
    """ do something with file. """
    print(file_name, 'what path or file was sent to function?')
    try:
        with open(file_name) as file:
            for line in file:
                if line:
                    print(line)
    except FileNotFoundError:
        print("Wrong file or file path")
    except OSError as err:
        print("OS error: {0}".format(err))
    print()
def main():
    # raw strings keep the backslashes literal (otherwise \n and \t become a newline and a tab)
    file_one = r'\\192.168.1.249\DataDrive_02TB\Tools\net_scanner\hi.txt'
    file_three = r'c:\temp\hi.txt'
    file_two = '//192.168.1.249/DataDrive_02TB/Tools/net_scanner/hi.txt'
    get_file(file_one)
    get_file(file_three)
    get_file(file_two)
    file_input = input("Enter the name of the file: ")
    get_file(file_input)
main()
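One further point about the question's code: its bare except reports every failure as a missing file. There, line is used before anything assigns it, which raises NameError inside the try block, so the "not found" message appears even when the file exists. The Python 3 sketch below illustrates this under some assumptions: the two function names are hypothetical, and a temporary file with made-up numbers stands in for the user's file.

```python
import os
import tempfile

def read_with_bare_except(path):
    """Mimics the question: any exception is reported as a missing file."""
    try:
        with open(path) as handle:
            number = int(line)  # 'line' was never assigned, so this raises NameError
        return "ok"
    except:
        return "The file is not found."

def read_with_specific_except(path):
    """Only a genuinely missing file is reported as missing."""
    try:
        with open(path) as handle:
            return sum(int(row) for row in handle if row.strip())
    except FileNotFoundError:
        return "The file is not found."

# Create a real file so that "not found" cannot be the true cause.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("1\n2\n3\n")

misleading = read_with_bare_except(tmp.name)
total = read_with_specific_except(tmp.name)
os.remove(tmp.name)

print(misleading)  # the misleading message, although the file exists
print(total)       # the sum of the numbers in the file
```

Catching only the exceptions you expect lets unrelated bugs surface with their real error message.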

transform for in loop to while loop

I have this assignment in a basic programming course where I need to transform this code to use a while loop instead of a for loop, but I don't know how to do it.
This is my code so far:
def read_txt(file_txt):
    file = open(file_txt, "r")
    lines = file.readlines()
    file.close()
    return lines
file_txt = input("file: ")
lines = read_txt(file_txt)
for l in lines:
    asd = l.split(",")
    length = len(asd)
    score = 0
    for i in range(1, length):
        score += int(asd[i])
    average = score / (length-1)
    print(asd[0], average)
The text file looks like this:
edward,4,3,1,2
sara,5,4,1,0

def read_txt(file_txt):
    file = open(file_txt, "r")
    lines = file.readlines()
    file.close()
    return lines
file_txt = input("file: ")
lines = read_txt(file_txt)
lines.reverse()
while lines:
    l = lines.pop()
    asd = l.split(",")
    length = len(asd)
    score = 0
    i = 1
    while i < length:
        score += int(asd[i])
        i += 1
    average = score / (length-1)
    print(asd[0], average)
Now the while loop iterates until lines is empty, popping the items off one by one.

For loops are more suitable than while loops for iterating over the lines of a file. A few improvements here are (1) use the built-in sum instead of manually adding up scores, and (2) don't read all the lines at once if the file is large.
file_txt = input("file: ")
with open(file_txt) as f:
    while True:
        line = f.readline()
        if not line:
            break
        name, scores = line.split(',', maxsplit=1)
        scores = scores.split(',')
        avg = sum(int(s) for s in scores) / len(scores)
        print(f'{name} {avg}')
As you can see above, a while loop needs the if not line check to detect the end of the file. A for loop does not need it, because the file object implements the iterator protocol (__iter__).
The walrus operator in Python 3.8 makes that slightly easier:
file_txt = input("file: ")
with open(file_txt) as f:
    while line := f.readline():
        name, scores = line.split(',', maxsplit=1)
        scores = scores.split(',')
        avg = sum(int(s) for s in scores) / len(scores)
        print(f'{name} {avg}')
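For comparison, here is a sketch of the equivalent for loop, which needs no explicit end-of-file check. It assumes io.StringIO as a stand-in for the opened file, filled with the two sample rows from the question, and it collects the results in a hypothetical results list so they can be inspected.

```python
import io

# Stand-in for the opened file, using the sample rows from the question.
f = io.StringIO("edward,4,3,1,2\nsara,5,4,1,0\n")

results = []
for line in f:  # iteration stops by itself at the end of the file
    name, scores = line.split(',', maxsplit=1)
    scores = scores.split(',')
    avg = sum(int(s) for s in scores) / len(scores)
    results.append((name, avg))
    print(f'{name} {avg}')
```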

The following gives the exact same output without using any for loop.
filename = input("file: ")
with open(filename) as f:
    f = f.readlines()
n = []
while f:
    v = f.pop()
    if v[-1] == '\n':
        n.append(v.strip('\n'))
    else:
        n.append(v)
d = {}
while n:
    v = n.pop()
    v = v.split(',')
    d[v[0]] = v[1:]
d_k = list(d.keys())
d_k.sort(reverse=True)
while d_k:
    v = d_k.pop()
    p = d[v]
    n = []
    while p:
        a = p.pop()
        a = int(a)
        n.append(a)
    print(str(v), str(sum(n)/len(n)))
Output:
edward 2.5
sara 2.5

Python incorrect indent for for-loop (Coursera Python Data Structure courses)

Why does my code print the empty list?
fname = input("Enter file name: ")
if len(fname) < 1 : fname = "mbox-short.txt"
fh = open(fname)
count = 0
lst = []
for line in fh:
line = line.rstrip()
word = line.split()
if len(word) < 0:
countinue
print(word[1])
the text file can be downloaded here

There are two issues that I found:
Be careful with the indentation; the wrong indent is what produced the empty print.
The space matters for finding the correct starting point. I accidentally left out the space after From, which produced duplicated results.
#Assignment 8.5
#file name = mbox-short.txt
fname = input("Enter file name: ")
if len(fname) < 1 : fname = "mbox-short.txt"
fh = open(fname)
count = 0
for line in fh:
    line = line.rstrip()
    if not line.startswith('From '): # check whether the line starts with 'From '
        continue # note the space after From; without it, the printed results would be duplicated
    word = line.split()
    count = count + 1
    print(word[1]) # be careful with the indent
print("There were", count, "lines in the file with From as the first word")
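The effect of the trailing space can be checked on a few lines. This is a small Python 3 sketch with made-up lines in the style of an mbox file (the address is hypothetical): without the space, the prefix also matches the From: header, so each address would be printed twice.

```python
# Made-up lines in the style of an mbox file.
lines = [
    "From alice@example.com Sat Jan  5 09:14:16 2008",
    "From: alice@example.com",
    "Subject: hello",
]

without_space = [l for l in lines if l.startswith("From")]
with_space = [l for l in lines if l.startswith("From ")]

print(len(without_space))        # matches both the 'From ' and the 'From:' line
print(len(with_space))           # matches only the 'From ' line
print(with_space[0].split()[1])  # the address is the second word
```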

counting the lines and extract the floating point values and compute the average of the values

So I need to write a program that prompts for a file name, then opens that file and reads through it, looking for lines of the form: X-DSPAM-Confidence: 0.8475
I am stuck on getting the sum of the extracted values, counting the lines, and printing the result for the user.
out_number = 'X-DSPAM-Confidence: 0.8475'
Num = 0.0
flag = 0
fileList = list()
fname = input('Enter the file name')
try:
fhand = open(fname)
except:
print('file cannot be opened:',fname)
for line in fhand:
fileList = line.split()
print(fileList)
for line in fileList:
if flag == 0:
pos = out_number.find(':')
Num = out_number[pos + 2:]
print (float(Num))

Your code contains an example line, and as you loop over each line in your file, you extract the number from that example line rather than from the line you just read.
So, here's what I would do:
import os
import sys
fname = input('Enter the file name: ')
if not os.path.isfile(fname):
    print('file cannot be opened:', fname)
    sys.exit(1)
prefix = 'X-DSPAM-Confidence: '
numbers = []
with open(fname) as infile:
    for line in infile:
        if not line.startswith(prefix): continue
        num = float(line.split(":",1)[1])
        print("found:", num)
        numbers.append(num)
# now, `numbers` contains all the floating point numbers from the file
average = sum(numbers)/len(numbers)
But we can make it more memory-efficient by keeping a running total instead of storing every number:
import os
import sys
fname = input('Enter the file name: ')
if not os.path.isfile(fname):
    print('file cannot be opened:', fname)
    sys.exit(1)
prefix = 'X-DSPAM-Confidence: '
tot = 0
count = 0
with open(fname) as infile:
    for line in infile:
        if not line.startswith(prefix): continue
        num = float(line.split(":",1)[1])  # convert to float before adding to the total
        tot += num
        count += 1
print("The average is:", tot/count)

Try this:
import re
pattern = re.compile(r"X-DSPAM-Confidence:\s(\d+\.\d+)")  # raw string; \. matches a literal dot
sum = 0.0
count = 0
fPath = input("file path: ")
with open(fPath, 'r') as f:  # pass the variable, not the string 'fPath'
    for line in f:
        match = pattern.match(line)
        if match is not None:
            lineValue = match.group(1)
            sum += float(lineValue)
            count += 1
print ("The average is:", sum /count)
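When matching the value with a regular expression, note that an unescaped . matches any character, whereas \. matches only a literal dot. The following Python 3 sketch compares the two; the names loose and strict are hypothetical, and the malformed line is made up to show the difference.

```python
import re

loose = re.compile(r"X-DSPAM-Confidence:\s(\d+.\d+)")    # '.' matches any character
strict = re.compile(r"X-DSPAM-Confidence:\s(\d+\.\d+)")  # '\.' matches a literal dot

odd_line = "X-DSPAM-Confidence: 0x8475"   # made-up malformed line
good_line = "X-DSPAM-Confidence: 0.8475"

loose_hit = loose.match(odd_line) is not None
strict_hit = strict.match(odd_line) is not None
value = float(strict.match(good_line).group(1))

print(loose_hit)   # the loose pattern accepts the malformed line
print(strict_hit)  # the strict pattern rejects it
print(value)
```

With the loose pattern, the captured text from the malformed line would then fail in float(), so the stricter pattern avoids a confusing error later on.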

fname = input("Enter file name: ")
fh = open(fname)
count=0
x=0
for line in fh:
    if not line.startswith("X-DSPAM-Confidence:") : continue
    x=float(line.split(":")[1].rstrip())+x
    count=count+1
output=x/count
print("Average spam confidence:",output)

Using the split() or find() function in python

I am writing a program that opens a file and looks for lines like this:
X-DSPAM-Confidence: 0.8475
I want to use the split and find functions to extract these lines and put them in a variable. This is the code I have written:
fname = raw_input("Enter file name: ")
if len(fname) == 0:
    fname = 'mbox-short.txt'
fh = open(fname,'r')
total = 0
for line in fh:
    if not line.startswith("X-DSPAM-Confidence:"): continue
I am just beginning with Python, so please give me something simple that I can understand and that will help me later on.

I think the only problem is the not in the if:
fname = raw_input("Enter file name: ")
if len(fname) == 0:
    fname = 'mbox-short.txt'
fh = open(fname,'r')
total = 0
lines = []
for line in fh:
    if line.startswith("X-DSPAM-Confidence:"):
        lines.append(line)

First receive the input with raw_input()
fname = raw_input("Enter file name: ")
Then check if the input string is empty:
if not fname:
    fname = 'mbox-short.txt'
Then, open the file and read it line by line:
lines = []
with open(fname, 'r') as f:
    for line in f.readlines():
        if line.startswith("X-DSPAM-Confidence:"):
            lines.append(line)
The with open() as file statement ensures that the file object gets closed when you no longer need it (file.close() is called automatically on exiting the with block).
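The automatic close can be observed through the file object's closed attribute. This is a minimal Python 3 sketch; it assumes a throwaway temporary file with one made-up line in place of mbox-short.txt.

```python
import os
import tempfile

# Create a small throwaway file to open.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("X-DSPAM-Confidence: 0.8475\n")

with open(tmp.name, 'r') as f:
    closed_inside = f.closed  # still open inside the block
    first_line = f.readline()

closed_after = f.closed       # closed once the block has ended
os.remove(tmp.name)

print(closed_inside, closed_after)
```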

I know where this one is coming from as I've done it myself some time ago. As far as I remember you need to calculate the average :)
fname = raw_input("Enter file name: ")
fh = open(fname)
count = 0
sum = 0
for line in fh:
    if not line.startswith("X-DSPAM-Confidence:") : continue
    count = count + 1
    pos = line.find(' ')
    sum = sum + float(line[pos:])
average = sum/count

You're very close, you just need to add a statement below the continue adding the line to a list.
fname = raw_input("Enter file name: ")
if len(fname) == 0:
    fname = 'mbox-short.txt'
fh = open(fname,'r')
total = 0
lines = []
for line in fh:
    if not line.startswith("X-DSPAM-Confidence:"):
        continue
    lines.append(line) # will only execute if the continue is not executed
fh.close()
You should also look at the with keyword for opening files - it's much safer and easier. You would use it like this (I also swapped the logic of your if - saves you a line and a needless continue):
fname = raw_input("Enter file name: ")
if len(fname) == 0:
    fname = 'mbox-short.txt'
total = 0
good_lines = []
with open(fname,'r') as fh:
    for line in fh:
        if line.startswith("X-DSPAM-Confidence:"):
            good_lines.append(line)
If you just want the values, you can do a list comprehension with the good_lines list like this:
values = [ l.split()[1] for l in good_lines ]
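The extracted values are still strings at this point. As a possible next step, the Python 3 sketch below converts them to floats and averages them; the two entries in good_lines are made up, and the guard against an empty list is an added assumption rather than part of the original answer.

```python
# Made-up matching lines standing in for good_lines.
good_lines = [
    "X-DSPAM-Confidence: 0.8475\n",
    "X-DSPAM-Confidence: 0.6178\n",
]

# split() breaks on whitespace, so element 1 is the number as a string.
values = [float(l.split()[1]) for l in good_lines]

# Guard against an empty list to avoid dividing by zero.
average = sum(values) / len(values) if values else 0.0

print("Average spam confidence:", average)
```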
