Computing a text document of numbers. Python

So I'm trying to calculate the sum and average of a text document filled with 10000 numbers.
This is my code:
with open("\\Users\\saksa\\python_courses\\1DV501\\assign3\\file_10000integers_A.txt", "r") as f:
    total = 0
    number_of_ints = 0
    for line in f:
        for i in line:
            if i.isdigit() == True:
                total += int(i)
                number_of_ints += 1
    print(total)
    print(number_of_ints)
The document is formatted like: 215, 631, 731, 225, 315, etc., spread over multiple lines.
The problem is that it reads every number one digit at a time, so 100 becomes 1 + 0 + 0.
I think I need to use split to make it work, but I can't figure out how.

You can iterate over each line in a file.
And you can split each line by a ',' and add each number
with open("\\Users\\saksa\\python_courses\\1DV501\\assign3\\file_10000integers_A.txt", "r") as f:
    total = 0
    number_of_ints = 0
    for line in f:
        print(line)
        for num in line.split(','):
            print(num)
            total += int(num)
            number_of_ints += 1
    print(total)
    print(number_of_ints)
You will still need to add some logic to make sure each piece really is a number before converting it, because int() raises ValueError on empty or non-numeric text.
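One way to add that check is sketched below. It is only an illustration: the function name sum_and_average and the sample lines are made up for this example, and it assumes the numbers are integers separated by commas, as described in the question.

```python
def sum_and_average(lines):
    """Return (total, count, average) for comma-separated integers."""
    total = 0
    count = 0
    for line in lines:
        for field in line.split(","):
            field = field.strip()
            if not field:
                continue  # skip empty fields, e.g. after a trailing comma
            try:
                total += int(field)
            except ValueError:
                continue  # skip anything that is not an integer
            count += 1
    # Avoid dividing by zero when no number was found.
    average = total / count if count else 0.0
    return total, count, average


# Hypothetical sample data; an open file object can be passed the same way.
sample = ["215, 631, 731,\n", "225, 315, oops\n"]
print(sum_and_average(sample))
```

Because the function only iterates over its argument, it can be called as sum_and_average(f) inside the with block from the answer above.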

You should use the split() function:
line = "110, 23400, 34569, 23567"
line = line.replace(" ", "")  # Get rid of unnecessary spaces; replace() returns a new string
total = 0
number_of_ints = 0
for i in line.split(","):
    # Here you can do whatever you want
    try:
        i = int(i)
        total += i
        number_of_ints += 1
    except ValueError:
        pass

Related

My code doesn't work for numbers with 2 or more digits and negative numbers

Basically, this is my task: extract numbers from a text file and then calculate their sum.
My code runs, but it doesn't give the right result for numbers with 2 or more digits or for negative numbers. What should I do?
f = open('file6.txt', 'r')
suma = 0
file = f.readlines()
for line in file:
    for i in line:
        if i.isdigit() == True:
            suma += int(i)
print("The sum is ", suma)
file6.txt:
1
10
Output:
The sum is 2
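In your code, the outer loop goes through the file line by line and the inner loop looks at every character of the line, so each digit is added on its own: 10 is counted as 1 + 0.
Check the whole line instead of each character. Strip the trailing \n first, otherwise isdigit() returns False for the line. Note that isdigit() also returns False when there is a leading minus sign, so this version still skips negative numbers.
So your updated code should look like this: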
f = open('file6.txt', 'r')
suma = 0
file = f.readlines()
for line in file:
    if line.strip().isdigit():
        suma += int(line)
print("The sum is ", suma)
Hope it helps!
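For files with one integer per line that may include negative values, a possible variation is to let int() do the validation instead of isdigit(). This is a sketch only; the function name sum_integer_lines and the sample lines are made up for illustration.

```python
def sum_integer_lines(lines):
    """Sum every line that holds a single integer, including negatives."""
    total = 0
    for line in lines:
        try:
            # int() ignores surrounding whitespace and accepts a leading sign.
            total += int(line)
        except ValueError:
            pass  # blank lines and lines with text are skipped
    return total


# isdigit() rejects the minus sign, which is why negative numbers were lost.
print("-5".isdigit())
# Hypothetical sample lines standing in for the file contents.
print(sum_integer_lines(["1\n", "10\n", "-5\n", "abc\n", "\n"]))
```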
Use re.split to split the input into words on anything that is not part of a number. Try to convert the words into numbers, silently skip if this fails.
import re
sum_nums_in_file = 0
with open('file6.txt') as f:
    for line in f:
        for word in re.split(r'[^-+\dEe.]+', line):
            try:
                num = float(word)
                sum_nums_in_file += num
            except ValueError:
                # Not a number (for example an empty string): skip it
                pass
print(f"The sum is {sum_nums_in_file}")
This works for example on files such as this:
-1 2.0e0
+3.0
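To see what the split produces, the same pattern can be tried on in-memory lines. This is a small sketch: the names NON_NUMBER_CHARS and sum_numbers are made up here, and the two sample lines are the ones from the example file above.

```python
import re

# Same pattern as in the answer: split on runs of characters that
# cannot be part of a number written like -1, +3.0 or 2.0e0.
NON_NUMBER_CHARS = re.compile(r'[^-+\dEe.]+')


def sum_numbers(lines):
    """Sum every word that float() accepts; ignore the rest."""
    total = 0.0
    for line in lines:
        for word in NON_NUMBER_CHARS.split(line):
            try:
                total += float(word)
            except ValueError:
                pass  # empty strings and stray characters end up here
    return total


print(NON_NUMBER_CHARS.split("-1 2.0e0\n"))
print(sum_numbers(["-1 2.0e0\n", "+3.0\n"]))
```

One limitation of this approach: text such as 1-2 stays together as a single word, float() rejects it, and both numbers are skipped.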

Summing and Average using python

I have written code that extracts floating point numbers from a text file and produces a list of the numbers.
My challenge is summing the consecutive numbers and finding their average.
I am not allowed to use the sum function and I am new to Python.
This is the code I have written so far.
What can I do to add up the numbers in the list?
fh = open(fname)
for line in fh:
    if line.startswith("X-DSPAM-Confidence:") : continue
    # print(line)
count = 0
for line in fh:
    if line.startswith("X-DSPAM-Confidence:"):
        count = count + 1
        # print(count)
for line in fh:
    if line.startswith("X-DSPAM-Confidence:"):
        # print(line)
        xpos = line.find(' ')
        # print(xpos)
        num = line[xpos : ]
        # print(float(num))
        fnum = float(num)
        # print(fnum)
        total = 0
        for i in fnum:
            total += int(i)
        print(total)
Error: "float object not iterable" on line 24 (line 24 is the 4th for loop).
First, an open file can be iterated only once, and your code has three loops starting with for line in fh:. After the first loop the file pointer is at the end of the file, so the following loops exit immediately. For that reason with should be preferred for opening the file.
Next, somewhere in the loop you get a float value in fnum. Just initialize total before starting the loop, and add fnum to it each time you get one:
total = 0
with open(fname) as fh:
    for line in fh:
        if line.startswith("X-DSPAM-Confidence:"):
            # print(line)
            xpos = line.find(' ')
            # print(xpos)
            num = line[xpos : ]
            # print(float(num))
            fnum = float(num)
            # print(fnum)
            total += fnum
# print(total)
with ensures that the file is cleanly closed when the block ends.
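The point about a file being iterable only once can be checked without a real file. The sketch below uses io.StringIO as a stand-in for an open text file (an assumption made so the example is self-contained); the variable names are made up for illustration.

```python
import io

# io.StringIO stands in for an open text file with three lines.
fh = io.StringIO("a\nb\nc\n")

first_pass = [line for line in fh]   # reads all three lines
second_pass = [line for line in fh]  # nothing left: already at end of file
fh.seek(0)                           # rewind to the beginning
third_pass = [line for line in fh]   # reads all three lines again

print(len(first_pass), len(second_pass), len(third_pass))
```

Rewinding with seek(0) works, but reading the file once and doing all the work in a single loop, as in the answer above, is usually simpler.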
fnum is a float, not a list, so it is not iterable and cannot be looped over with for.
You probably don't need a list to determine the total and the average:
fname = "c:\\mbox-short.txt"
fh = open(fname)
count = 0
total = 0
for line in fh:
    if line.startswith("X-DSPAM-Confidence:"):
        xpos = line.find(' ')
        num = line[xpos : ]
        fnum = float(num)
        total += fnum
        count += 1
print("Total = " + str(total))
print("Average = " + str(total / count))
print("Number of items = " + str(count))
You don't have to use startswith in this case. It is better to call split on each line of the file, which breaks the line into a list of words. Then check whether the first word (index 0) is X-DSPAM-Confidence:. If it is, take the corresponding value of interest, which is at index 1. Below is the code:
total = 0
number_of_items = 0
with open("dat.txt", 'r') as f:
    for line in f:
        fields = line.split()
        if fields != []:
            if fields[0] == "X-DSPAM-Confidence:":
                number_of_items += 1
                total += float(fields[1])
print(total)
print(number_of_items)
avg = (total/number_of_items)
print(avg)
I saved your data in a text file named "dat.txt".
Hope it helps !!!

How to find largest number in file and see on which line it is

The file looks like this:
1, a b
2, c d
3, e f
my current code
b = open('file.txt', 'r')
c = b.readlines()
regels = len(c)
print(regels)
I got the numbers of lines but still need biggest number + on which line it is.
So you are just looking to find the biggest number in the first column of the file? This should help
b = open('file.txt', 'r')
c = b.readlines()
regels = len(c)
print(regels)
max_num = 0
for line in c:  # reuse the lines already read; a second readlines() would return nothing
    num = int(line.split(",")[0])
    if max_num < num:
        max_num = num
print(max_num)
# Close file
b.close()
This is how I'd go about doing it.
max_num = 0
with open('file.txt', 'r') as data:  # use the with context so that the file closes gracefully
    for line in data:  # iterate over the file lazily to be nice to my memory
        try:
            val = int(line.split(",")[0])
        except ValueError:  # just in case the text file is not formatted like your example
            val = 0
        if val > max_num:  # logic
            max_num = val
print(max_num)  # result
You need to loop over each line in the file, parse it, and keep track of the largest number.
I do not quite understand how the numbers are stored in your file. I am assuming that the first field of each line is numeric and is separated from the other (non-numeric) fields by ','. I also assume all the numbers are integers.
ln = 0
maxln = 0
maxn = 0
with open(filename, 'r') as f:
    line = next(f, None)  # first line, or None if the file is empty
    if line is not None:
        ln = 1
        maxln = 1
        maxn = int(line.split(",")[0].strip())
    else:
        raise Exception('Empty content')
    for line in f:
        ln += 1
        cur = int(line.split(",")[0].strip())
        if cur > maxn:
            maxn = cur
            maxln = ln
ln is used to record current line number, maxn is used to record current maximum number, and maxln is used to record current maximum number location.
One thing you need to do is fetch the first line to initialize these variables.
None of the answers give you the line of the max number so I'll post some quick code and refine later
max_num = 0
line_num = 0
with open('file.txt', 'r') as infile:
    for line_count, line in enumerate(infile, start=1):  # line numbers start at 1
        number = int(line.split(',')[0])
        if number > max_num:
            max_num = number
            line_num = line_count
print(max_num)
print(line_num)
Read each line, split it on the comma, and append the first element to a temporary list.
Once the whole file has been read, use the max function on the temporary list to get the maximum number.
Because the file is read sequentially, the position of a number in the list matches its line: find the index of the maximum in the list and add one, since list indexes start at zero.
P.S.: Check the last three print statements.
Code:
num_list = []
with open('master.csv', 'r') as fh:
    for line in fh.readlines():
        num_list.append(int((line.split(','))[0]))
print(num_list)
print("Max number is -", max(num_list))
print("Line number is -", (num_list.index(max(num_list))) + 1)
Output:
C:\Users\dinesh_pundkar\Desktop>python c.py
[1, 2, 3]
Max number is - 3
Line number is - 3
C:\Users\dinesh_pundkar\Desktop>
Iterate through the file and keep track of the highest number you've seen and the line you found it on. Just replace this with the new number and new line number when you see a bigger one.
b = open('file.txt', 'r')
max = -1
lineNum = -1
line = b.readline()
index = 0
while line:
    index += 1
    newNum = int(line.split(",")[0])  # compare numbers, not the first character
    if newNum > max:
        max = newNum
        lineNum = index
    line = b.readline()
b.close()
print(lineNum, max, index)
max is your highest number, lineNum is where it was, and index is the number of lines in the file
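The approaches above can be condensed into one helper that returns both the line number and the value. This is a sketch under the same assumption as the answers (the first comma-separated field of each line is an integer); the name largest_with_line and the sample lines are made up for illustration.

```python
def largest_with_line(lines):
    """Return (line_number, value) for the largest first-column integer,
    or None when no line starts with an integer."""
    best = None
    for line_number, line in enumerate(lines, start=1):
        try:
            value = int(line.split(",")[0])
        except ValueError:
            continue  # skip blank or malformed lines
        # Starting from None instead of 0 keeps negative numbers working.
        if best is None or value > best[1]:
            best = (line_number, value)
    return best


# Sample lines copied from the question; an open file can be passed as well.
sample = ["1, a b\n", "2, c d\n", "3, e f\n"]
print(largest_with_line(sample))
```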

Repeating a for in line loop python

How would I repeat this (excluding the opening of the file and the setting of the variables)?
this is my code in python3
file = ('file.csv','r')
count = 0 #counts number of times i was equal to 1
i = 0 #column number
for line in file:
    line = line.split(",")
    if line[i] == 1:
        count = count + 1
i = i+1
If I understand the question, try this and adjust it for however you want to format the output. Replace NUM_COLUMNS with the number of columns you want to check, which is the number of times the loop repeats.
file = open('file.csv','r')
data = file.readlines()
file.close()
for i in range(NUM_COLUMNS):
    count = 0
    for line in data:
        line = line.strip().split(",")  # strip the newline so the last column compares correctly
        if line[i] == "1":
            count = count + 1
    print(count)
The following function will return the number of fields in the csv file file_name whose value is field_value, which is what I think you are trying to do:
import csv
def get_count(file_name, field_value):
    count = 0
    with open(file_name) as f:
        reader = csv.reader(f)
        for row in reader:
            count += row.count(field_value)
    return count
print(get_count('file.csv', '1'))
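If a separate count for each column is wanted, which is how the first answer reads the question, the csv module can handle that in a single pass. The sketch below is an illustration only: count_per_column is a made-up name, and io.StringIO with made-up rows stands in for the real file.csv.

```python
import csv
import io
from collections import Counter


def count_per_column(csv_file, field_value):
    """Return a Counter mapping column index to the number of rows
    whose field in that column equals field_value."""
    counts = Counter()
    for row in csv.reader(csv_file):
        for column, field in enumerate(row):
            if field.strip() == field_value:
                counts[column] += 1
    return counts


# Hypothetical data; pass an open file object in real use.
sample = io.StringIO("1,0,1\n0,0,1\n1,1,1\n")
print(count_per_column(sample, "1"))
```

This avoids having to know the number of columns in advance, since the Counter grows as columns are seen.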

Calculating average of numbers in a file

So what I have to do is figure out how to ask a user for a text file and output the average of all the numbers in it. By tinkering around I have figured out how to find the average of a list I made, but I'm not sure how to do it for a file that the user gives me. This is what I have right now:
with open('average', 'wt') as myFile:
    myFile.write('3\n')
    myFile.write('45\n')
    myFile.write('83\n')
    myFile.write('21\n')
with open('average', 'rt') as myFile:
    total, n = 0, 0
    for line in myFile:
        total += int(line)
        n += 1
    print(float(total) / n)
Supposing that there is one number on each line of the file:
with open(input('Filename: '), 'r') as f:
    numbers = [int(a.strip()) for a in f]
    print('Average is {}'.format(sum(numbers)/len(numbers)))
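The one-liner above divides by zero on an empty file and fails on blank lines. A slightly more defensive variation is sketched below, still assuming one integer per line; the name average_of_lines is made up for illustration, and the sample values are the ones written to the file in the question.

```python
def average_of_lines(lines):
    """Return the average of the integers found one per line,
    or None when there are no numbers at all."""
    total = 0
    count = 0
    for line in lines:
        text = line.strip()
        if not text:
            continue  # ignore blank lines
        total += int(text)
        count += 1
    if count == 0:
        return None  # nothing to average
    return total / count


# Same values as in the question; an open file can be passed instead.
print(average_of_lines(["3\n", "45\n", "83\n", "21\n"]))
```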
Something like this?
fileName = input("What file name: ")
lines = []
try:
    with open(fileName) as file:
        lines = file.readlines()
except OSError:
    print("Unable to open file")
total = 0
values = 0
if len(lines) > 0:
    for line in lines:
        try:
            value = int(line.strip())
        except ValueError:
            continue  # skip blank lines and lines with text
        total = total + value
        values += 1
    if values > 0:
        print("Average = %f for %d lines, sum = %f" % (total / values, values, total))
    else:
        print("No numbers in the file")
else:
    print("No lines in the file")
NOTE: This assumes one number per line. It will not count blank lines or lines that have text. Other than that, junk on the lines or a bad file should not cause an exception.
This was the test file (there are blank lines):
10
20
30
40
50
23
5
asdfadfs
s
And the output:
What file name: numbers.txt
Average = 25.428571 for 7 lines, sum = 178.000000
