I have written code that extracts floating-point numbers from a
text file and produces a list of the numbers.
My challenge is summing the numbers one after another and finding
their average.
I am not allowed to use the sum function, and I am new to Python.
This is the code I have written so far.
How can I add up the numbers in the list?
fh = open(fname)
for line in fh:
    if line.startswith("X-DSPAM-Confidence:") : continue
    # print(line)
count = 0
for line in fh:
    if line.startswith("X-DSPAM-Confidence:"):
        count = count + 1
# print(count)
for line in fh:
    if line.startswith("X-DSPAM-Confidence:"):
        # print(line)
        xpos = line.find(' ')
        # print(xpos)
        num = line[xpos : ]
        # print(float(num))
        fnum = float(num)
        # print(fnum)
total = 0
for i in fnum:
    total += int(i)
print(total)
Error: "float object not iterable" on line 24 (line 24 is the 4th for loop).
First, an open file can be iterated only once, and your code has three loops starting with for line in fh:. After the first loop the file pointer is at the end of the file, so the following loops finish immediately without reading anything. Do all the work in a single loop, and prefer a with block for opening the file.
Next, inside the loop you get a float value in fnum. Just initialize total before starting the loop, and add fnum to it each time you get one:
total = 0
with open(fname) as fh:
    for line in fh:
        if line.startswith("X-DSPAM-Confidence:"):
            # print(line)
            xpos = line.find(' ')
            # print(xpos)
            num = line[xpos : ]
            # print(float(num))
            fnum = float(num)
            # print(fnum)
            total += fnum
# print(total)
with ensures that the file is closed cleanly when the block ends.
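To see the single-pass behaviour of a file object in isolation, here is a minimal sketch. It assumes io.StringIO as a stand-in for the opened file, and the two sample lines are invented for illustration. The second loop reads nothing until seek(0) rewinds the file:

```python
import io

# Stand-in for open(fname); the two sample lines are made up for illustration.
fh = io.StringIO("X-DSPAM-Confidence: 0.8475\nX-DSPAM-Confidence: 0.6178\n")

first_pass = [line for line in fh]    # consumes the whole file
second_pass = [line for line in fh]   # already at end of file: nothing left
fh.seek(0)                            # rewind to the beginning
third_pass = [line for line in fh]    # the lines are available again

print(len(first_pass), len(second_pass), len(third_pass))  # prints: 2 0 2
```

Rewinding works, but reading the file once and doing all the work in that single loop is usually simpler.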
fnum is a float. It's not an array, therefore it's not iterable and cannot be iterated in a for loop.
You probably don't need an array to determine the total and the average:
fname = "c:\\mbox-short.txt"
fh = open(fname)
count = 0
total = 0
for line in fh:
    if line.startswith("X-DSPAM-Confidence:"):
        xpos = line.find(' ')
        num = line[xpos : ]
        fnum = float(num)
        total += fnum
        count += 1
print("Total = " + str(total))
print("Average = " + str(total / count))
print("Number of items = " + str(count))
You don't have to use startswith in this case. You can call split on each line of the file instead, which breaks the line into a list of words. Then check whether the first word (index 0) is X-DSPAM-Confidence:. If it is, the value of interest is the next word, at index 1. Below is the code:
total = 0
number_of_items = 0
with open("dat.txt", 'r') as f:
    for line in f:
        fields = line.split()
        if fields != []:
            if fields[0] == "X-DSPAM-Confidence:":
                number_of_items += 1
                total += float(fields[1])
print(total)
print(number_of_items)
avg = (total/number_of_items)
print(avg)
I saved your data in a text file named "dat.txt".
Hope it helps !!!
Related
So I'm trying to calculate the sum and average of a text document filled with 10000 numbers.
This is my code:
with open("\\Users\\saksa\\python_courses\\1DV501\\assign3\\file_10000integers_A.txt", "r") as f:
    total = 0
    number_of_ints = 0
    for line in f:
        for i in line:
            if i.isdigit() == True:
                total += int(i)
                number_of_ints +=1
print (total)
print (number_of_ints)
The document is formatted like 215, 631, 731, 225, 315, etc., across multiple lines.
The problem is that it reads every digit one by one, so 100 becomes 1 + 0 + 0.
I think I need to use split to make it work, but I can't figure out how.
You can iterate over each line in a file.
And you can split each line by a ',' and add each number
with open("\\Users\\saksa\\python_courses\\1DV501\\assign3\\file_10000integers_A.txt", "r") as f:
    total = 0
    number_of_ints = 0
    for line in f:
        print(line)
        for num in line.split(','):
            print(num)
            total += int(num)
            number_of_ints += 1
print(total)
print(number_of_ints)
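You will need to add some logic to ensure that each piece is actually a number, because int() raises ValueError for anything else (for example, the leftover piece after a trailing comma at the end of a line).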
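One way to add that checking is sketched below. This is only an illustration: the helper name sum_numbers and the sample lines are made up, and it assumes that anything int() cannot parse should simply be skipped.

```python
def sum_numbers(lines):
    """Return (total, count) for comma-separated integers, skipping bad tokens."""
    total = 0
    count = 0
    for line in lines:
        for token in line.split(","):
            token = token.strip()
            if not token:          # empty piece, e.g. after a trailing comma
                continue
            try:
                total += int(token)
            except ValueError:     # not an integer, e.g. "abc"
                continue
            count += 1
    return total, count


sample = ["215, 631, 731,\n", "225, abc, 315\n"]  # made-up sample lines
total, count = sum_numbers(sample)
print(total, count, total / count)
```

Because the function only needs something it can loop over line by line, an open file object can be passed in place of the sample list.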
You should use the split() function:
line = "110, 23400, 34569, 23567"
line = line.replace(" ", "")  # strings are immutable, so keep the result; this removes the spaces
total = 0
number_of_ints = 0
for i in line.split(","):
    # Here you can do whatever you want
    try:
        total += int(i)
        number_of_ints += 1
    except ValueError:  # skip anything that is not an integer
        pass
The file looks like this:
1, a b
2, c d
3, e f
My current code:
b = open('file.txt', 'r')
c = b.readlines()
regels = len(c)
print(regels)
I got the number of lines, but I still need the biggest number and the line it is on.
So you are just looking to find the biggest number in the first column of the file? This should help:
b = open('file.txt', 'r')
c = b.readlines()
regels = len(c)
print(regels)
largest = 0  # avoid the name max, which would shadow the built-in
for line in c:  # reuse the lines already read; a second readlines() would return nothing
    num = int(line.split(",")[0])
    if largest < num:
        largest = num
print(largest)
# Close file
b.close()
This is how I'd go about doing it.
max_num = 0
with open('file.txt', 'r') as data:  # use the with context so that the file closes gracefully
    for line in data:  # iterate over the file lazily, line by line, to be nice to my memory
        try:
            val = int(line.split(",")[0])
        except ValueError:  # just in case the text file is not formatted like your example
            val = 0
        if val > max_num:  # logic
            max_num = val
print(max_num)  # result
You need to loop over each line in the file, parse each line, and find the largest number.
I do not quite understand how the numbers are stored in your file. I am just assuming that the first field of each line is numeric and is separated from the other (non-numeric) fields by ','. I also assume all the numbers are integers.
ln = 0
maxln = 0
maxn = 0
with open(filename, 'r') as f:
    line = next(f, '')  # first line, or '' if the file is empty
    if line:
        ln = 1
        maxln = 1
        maxn = int(line.split(",")[0].strip())
    else:
        raise Exception('Empty content')
    for line in f:
        ln += 1
        cur = int(line.split(",")[0].strip())
        if cur > maxn:
            maxn = cur
            maxln = ln
ln is used to record current line number, maxn is used to record current maximum number, and maxln is used to record current maximum number location.
One thing you need to do is fetch the first line to initialize these variables.
None of the answers give you the line of the max number, so I'll post some quick code and refine it later:
max_num = 0
line_num = 0
line_count = 0
with open('file.txt', 'r') as infile:
    for line in infile:
        line_count += 1  # count first so that line numbers start at 1
        number = int(line.split(',')[0])
        if number > max_num:
            max_num = number
            line_num = line_count
print(max_num)
print(line_num)
Read each line.
Split it on the comma.
Append the first element to a temporary list.
Once the whole file has been read, use the max function on the temporary list to get the maximum number.
Because the file is read sequentially and one number is appended per line, the line number of the maximum is the index of the max number in the temporary list plus one, since list indexes start at zero.
P.S.: Check the last three print statements.
Code:
num_list = []
with open('master.csv','r')as fh:
    for line in fh.readlines():
        num_list.append(int((line.split(','))[0]))
print num_list
print "Max number is -" ,max(num_list)
print "Line number is - ", (num_list.index(max(num_list)))+1
Output:
C:\Users\dinesh_pundkar\Desktop>python c.py
[1, 2, 3]
Max number is - 3
Line number is - 3
C:\Users\dinesh_pundkar\Desktop>
Iterate through the file and keep track of the highest number you've seen and the line you found it on. Just replace this with the new number and new line number when you see a bigger one.
b = open('file.txt', 'r')
max = -1
lineNum = -1
line = b.readline()
index = 0
while line:
    index += 1
    newNum = int(line.split(",")[0])  # parse the whole first field, not just its first character
    if newNum > max:
        max = newNum
        lineNum = index
    line = b.readline()
b.close()
print(lineNum, max, index)
max is your highest number, lineNum is where it was, and index is the number of lines in the file
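The same bookkeeping can be written with enumerate, which supplies the line number. The sketch below is only an illustration: the function name max_with_line is made up, io.StringIO stands in for the real file, and it assumes lines whose first field is not an integer should be skipped.

```python
import io


def max_with_line(lines):
    """Return (largest first-column number, 1-based line number), or None."""
    best = None
    for line_no, line in enumerate(lines, start=1):
        try:
            value = int(line.split(",")[0])
        except ValueError:   # blank line or non-numeric first field
            continue
        if best is None or value > best[0]:
            best = (value, line_no)
    return best


sample = io.StringIO("1, a b\n2, c d\n3, e f\n")  # same layout as the example file
print(max_with_line(sample))  # prints: (3, 3)
```

Starting from None instead of 0 or -1 means the result is still correct when every number in the file is negative.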
How would I repeat this (excluding the opening of the file and the setting of the variables)?
This is my code in Python 3:
file = ('file.csv','r')
count = 0 #counts number of times i was equal to 1
i = 0 #column number
for line in file:
    line = line.split(",")
    if line[i] == 1:
        count = count + 1
i = i+1
If I understand the question, try this and adjust it for however you want to format the output. Replace NUM_COLUMNS with the number of columns you want to count:
file = open('file.csv','r')
data = file.readlines()
file.close()
for i in range(NUM_COLUMNS):
    count = 0
    for line in data:
        fields = line.strip().split(",")  # strip the newline so the last column compares correctly
        if fields[i] == "1":
            count = count + 1
    print(count)
The following function will return the number of fields in the csv file file_name whose value is field_value, which is what I think you are trying to do:
import csv
def get_count(file_name, field_value):
    count = 0
    with open(file_name) as f:
        reader = csv.reader(f)
        for row in reader:
            count += row.count(field_value)
    return count
print(get_count('file.csv', '1'))
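If a separate count for each column is what is needed, the csv approach can be extended as in the sketch below. This is only an illustration: the function name count_per_column and the sample data are made up, and io.StringIO stands in for the opened file.

```python
import csv
import io
from collections import Counter


def count_per_column(f, field_value):
    """Map column index -> number of rows whose field equals field_value."""
    counts = Counter()
    for row in csv.reader(f):
        for index, field in enumerate(row):
            if field.strip() == field_value:
                counts[index] += 1
    return counts


sample = io.StringIO("1,0,1\n0,0,1\n1,1,1\n")  # made-up data for illustration
counts = count_per_column(sample, "1")
print(sorted(counts.items()))  # prints: [(0, 2), (1, 1), (2, 3)]
```

The file is read only once, and every column is tallied in that single pass, so there is no need to repeat the loop per column.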
So what I have to do is figure out how to ask a user for a text file and output the average of all the numbers in it. By tinkering around I have figured out how to find the average of a list I made, but I am not sure how to do it for a file that the user gives me. This is what I have right now:
with open('average', 'wt') as myFile:
    myFile.write('3\n')
    myFile.write('45\n')
    myFile.write('83\n')
    myFile.write('21\n')
with open('average', 'rt') as myFile:
    total, n = 0, 0
    for line in myFile:
        total += int(line)
        n += 1
    print(float(total) / n)
Supposing that there is one number on each line of the file:
with open(input('Filename: '), 'r') as f:
    numbers = [int(a.strip()) for a in f]
print('Average is {}'.format(sum(numbers)/len(numbers)))
Something like this?
fileName = input("What file name: ")
lines = []
try:
    with open(fileName) as file:
        lines = file.readlines()
except IOError:
    print("Unable to open file")
total = 0  # named total so the built-in sum is not shadowed
values = 0
if len(lines) > 0:
    for line in lines:
        try:
            value = int(line.strip())
        except ValueError:
            continue  # skip blank lines and lines with text
        total = total + value
        values += 1
    if values > 0:
        print("Average = %f for %d lines, sum = %f" % (total / values, values, total))
    else:
        print("No numbers in the file")
else:
    print("No lines in the file")
NOTE: This assumes one number per line. It will not count blank lines or lines that have text. Other than that, junk on the lines or a bad file should not cause an exception, etc.
This was the test file (there are blank lines):
10
20
30
40
50
23
5
asdfadfs
s
And the output:
What file name: numbers.txt
Average = 25.428571 for 7 lines, sum = 178.000000
I'm writing a Python function to add up the numbers from each line so that I can then find the average. This is what my file looks like:
-2.7858521
-2.8549764
-2.8881847
2.897689
1.6789098
-0.07865
1.23589
2.532461
0.067825
-3.0373958
Basically, I've written a program that loops over each line, incrementing a line counter and converting each line to a float value.
counterTot = 0
with open('predictions2.txt', 'r') as infile:
    for line in infile:
        counterTot += 1
        i = float(line.strip())
Now is the part where I get a little stuck:
totalSum =
mean = totalSum / counterTot
print(mean)
As you can tell, I'm new to Python, but I find it very handy for text analysis work, so I'm getting into it.
Extra function
I was also looking into an extra feature, but it should be a separate function from the one above.
counterTot = 0
with open('predictions2.txt', 'r') as infile:
    for line in infile:
        counterTot += 1
        i = float(line.strip())
        if i > 3:
            i = 3
        elif i < -3:
            i = -3
As you can see from the code, the function checks whether a number is bigger than 3 and, if so, makes it 3. If the number is smaller than -3, it makes it -3. I'm trying to output this to a new file so that it keeps its structure intact. In both situations I would like to keep the decimal places. I can always round the output numbers myself; I just need the numbers intact.
You can do this without loading the elements into a list by cheekily using fileinput and retrieve the line count from that:
import fileinput
fin = fileinput.input('your_file')
total = sum(float(line) for line in fin)
print(total / fin.lineno())
You can use enumerate here:
with open('predictions2.txt') as f:
    tot_sum = 0
    for i, x in enumerate(f, 1):
        val = float(x)
        # do something with val
        tot_sum += val  # add val to tot_sum
    print(tot_sum / i)  # print average
#prints -0.32322842
I think you want something like this:
with open('numbers.txt') as f:
    numbers = f.readlines()
average = sum([float(n) for n in numbers]) / len(numbers)
print(average)
Output:
-0.32322842
It reads your numbers from numbers.txt, splits them by newline, casts them to a float, adds them all up and then divides the total by the length of your list.
Do you mean you need to change 5.1234 to 3.1234 and -8.5432 to -3.5432 ?
line = " -5.123456 "
i = float(line.strip())
if i > 3:
    n = int(i)
    i = i - (n - 3)
elif i < -3:
    n = int(i)
    i = i - (n + 3)
print(i)
It gives you
-3.123456
Edit:
shorter version
line = " -5.123456 "
i = float(line.strip())
if i >= 4:
    i -= int(i) - 3
elif i <= -4:
    i -= int(i) + 3
print(i)
Edit 2:
If you need to change 5.1234 to 3.0000 ("3" and 4x "0") and -8.7654321 to -3.0000000 ("-3" and 7x "0")
line = " -5.123456 "
line = line.strip()
i = float(line)
if i > 3:
    length = len(line.split(".")[1])
    i = "3.%s" % ("0" * length) # now it will be string again
elif i < -3:
    length = len(line.split(".")[1])
    i = "-3.%s" % ("0" * length) # now it will be string again
print(i)
Here is a more verbose version. You could decide to replace invalid lines (if any) with a neutral value instead of ignoring them.
numbers = []
with open('myFile.txt', 'r') as myFile:
    for line in myFile:
        try:
            value = float(line)
        except ValueError:
            print(line.strip(), "is not a valid float")  # or numbers.append(defaultValue) if an exception occurred
        else:
            numbers.append(value)
print(sum(numbers) / len(numbers))
For your second request, here is the most straightforward solution:
def clamp(value, lowBound, highBound):
    return max(min(highBound, value), lowBound)
Applying it to our list:
clampedValues = [clamp(x, -3.0, 3.0) for x in numbers]
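The question also asks for the clamped numbers to be written to a new file without losing decimal places. A minimal sketch of one way to do that follows. The helper name clamp_lines is made up, and io.StringIO objects stand in for the input and output files. It assumes that in-range lines should be copied through as text, so their digits are untouched, and that out-of-range lines are replaced by the bound itself.

```python
import io


def clamp(value, low, high):
    return max(min(high, value), low)


def clamp_lines(infile, outfile, low=-3.0, high=3.0):
    """Copy one number per line, replacing out-of-range values with the bound."""
    for line in infile:
        text = line.strip()
        if not text:                      # skip blank lines
            continue
        value = float(text)
        if low <= value <= high:
            outfile.write(text + "\n")    # in range: keep the original digits
        else:
            outfile.write("{}\n".format(clamp(value, low, high)))


source = io.StringIO("-2.7858521\n-3.0373958\n3.5\n")  # made-up sample input
result = io.StringIO()
clamp_lines(source, result)
print(result.getvalue())
```

With real files, the two StringIO objects would be replaced by the input file opened for reading and an output file opened with mode 'w'.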