I'd like my program to read a text file that is formatted like this, for example:
Kristen
100
Maria
75
Frank
23
Is there any way in Python to skip the name lines and read only the numbers, accumulate them, and average them? There could be more or fewer numbers than in the example above. I'm very much stuck.
You can use re.findall to find all numbers in a string:
import re
if __name__ == "__main__":
    numbers = []
    with open("./file.txt", "r") as f:
        for line in f:
            line = line.strip()
            # Convert each run of digits with int() rather than eval()
            temp = [int(x) for x in re.findall(r'\d+', line)]
            numbers += temp
    average = sum(numbers) / len(numbers)
    print(average)
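One caveat with the pattern above: r'\d+' matches runs of digits only, so a minus sign is dropped and a decimal such as 7.5 comes back as two separate numbers. Here is a minimal sketch of a stricter pattern, assuming the scores may be signed or decimal. The helper name extract_numbers is made up for illustration.

```python
import re

# Optional sign, digits, optional fractional part.
NUMBER = re.compile(r'[-+]?\d+(?:\.\d+)?')

def extract_numbers(text):
    """Return every number found in text as a float."""
    return [float(match) for match in NUMBER.findall(text)]

print(re.findall(r'\d+', "-3 and 7.5"))   # ['3', '7', '5']
print(extract_numbers("-3 and 7.5"))      # [-3.0, 7.5]
```

Either pattern will also pick up digits that appear inside a name, so this only suits files where names contain no digits.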
This is the method I would use:
def get_average(filepath):
    total = 0.0
    with open(filepath, 'r') as f:
        lines = f.readlines()
    numbers = 0
    for line in lines:
        try:
            number = int(line.strip())
            total += number
            numbers += 1
        except ValueError:
            # Not a number (for example a name): skip the line
            continue
    return total / numbers
get_average("path/to/file.txt")
Use strip to get rid of the newline and isdigit to check for digits:
In [8]: with open('a.txt', 'r') as f:
   ...:     s = [int(i.strip()) for i in f if i.strip().isdigit()]
   ...:
In [9]: sum(s)/len(s)
Out[9]: 66.0
# Assuming a score always follows a player's name.
with open("path_to_file.ext", "r") as inf:
    print(inf.readlines()[1::2])  # Do something with the result
# Only grabbing lines that can be interpreted as numbers.
with open("path_to_file.ext", "r") as inf:
    for line in inf:
        if line.rstrip().isnumeric():
            print(line.rstrip())  # Do something with the result
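A caveat on the string checks used above: isdigit() and isnumeric() reject a leading minus sign, and they accept a few Unicode characters that int() cannot convert. The sketch below compares them with simply trying int(). The helper parses_as_int is a hypothetical name, and the sample strings are chosen only to show the differences.

```python
def parses_as_int(text):
    """True if int() accepts text (surrounding whitespace is allowed)."""
    try:
        int(text)
        return True
    except ValueError:
        return False

# "\u00b2" is a superscript two, "\u00bd" is the one-half character.
for text in ["23", "-23", "\u00b2", "\u00bd"]:
    print(ascii(text), text.isdigit(), text.isnumeric(), parses_as_int(text))
# '23' True True True
# '-23' False False True
# '\xb2' True True False
# '\xbd' False True False
```

For plain score files the checks behave the same, but try/except around int() is the safer default when negative values are possible.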
If the file is named 'file.txt':
total = 0
i = 0
with open('file.txt', 'r') as file:
    for line in file:
        try:
            total += int(line)
            i += 1
        except ValueError:
            # Skip lines that are not numbers
            continue
average = total / i
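All of the answers above divide by the number of scores found, which raises ZeroDivisionError when the file contains no numeric lines. Here is a minimal sketch that guards against that case. The function name average_of_numeric_lines and the choice to return None are assumptions, not part of the answers above.

```python
def average_of_numeric_lines(lines):
    """Average the lines that parse as integers; None if there are none."""
    total = 0
    count = 0
    for line in lines:
        try:
            total += int(line)   # int() ignores surrounding whitespace
            count += 1
        except ValueError:
            continue             # a name or a blank line: skip it
    return total / count if count else None

sample = ["Kristen\n", "100\n", "Maria\n", "75\n", "Frank\n", "23\n"]
print(average_of_numeric_lines(sample))          # 66.0
print(average_of_numeric_lines(["Kristen\n"]))   # None
```

An open file can be passed in directly, since iterating over a file object yields its lines.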
Basically, this is my task: extract numbers from a text file and then calculate their sum.
I wrote the code and it runs, but it doesn't work correctly with numbers of two or more digits or with negative numbers. What should I do?
f = open('file6.txt', 'r')
suma = 0
file = f.readlines()
for line in file:
    for i in line:
        if i.isdigit() == True:
            suma += int(i)
print("The sum is ", suma)
file6.txt:
1
10
Output:
The sum is 2
In your code, the outer loop goes through the file line by line and the inner loop looks at every character of the line, so each digit is added separately: 10 contributes 1 + 0.
If you check the whole line with .isdigit() instead, the \n at the end of each line makes the check fail, so strip it first.
Your updated code should look like this:
f = open('file6.txt', 'r')
suma = 0
file = f.readlines()
for line in file:
    if line.strip().isdigit():
        suma += int(line)
print("The sum is ", suma)
Hope it helps!
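Note that str.isdigit() returns False for a string such as '-5', so the check above still skips negative numbers, which the question also asks about. Here is a sketch that lets int() decide instead, assuming one number per line (sum_int_lines is an illustrative name):

```python
def sum_int_lines(lines):
    """Sum every line that int() can parse, including negative numbers."""
    total = 0
    for line in lines:
        try:
            total += int(line)
        except ValueError:
            pass  # not a whole number: ignore the line
    return total

print("-5".isdigit())                                    # False
print(sum_int_lines(["1\n", "10\n", "-5\n", "abc\n"]))   # 6
```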
Use re.split to split the input into words on anything that is not part of a number. Try to convert the words into numbers, silently skip if this fails.
import re
sum_nums_in_file = 0
with open('file6.txt') as f:
    for line in f:
        for word in re.split(r'[^-+\dEe.]+', line):
            try:
                num = float(word)
                sum_nums_in_file += num
            except ValueError:
                # Not a valid number (e.g. an empty string): skip it
                pass
print(f"The sum is {sum_nums_in_file}")
This works for example on files such as this:
-1 2.0e0
+3.0
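To see what the split produces, here is the same pattern applied to the two sample lines, wrapped in a function so it can be run without a file. The names PATTERN and sum_numbers are introduced here for illustration only.

```python
import re

PATTERN = r'[^-+\dEe.]+'

def sum_numbers(text):
    """Sum every token of text that float() accepts."""
    total = 0.0
    for line in text.splitlines():
        for word in re.split(PATTERN, line):
            try:
                total += float(word)
            except ValueError:
                pass  # empty strings and stray characters land here
    return total

print(re.split(PATTERN, "-1 2.0e0"))     # ['-1', '2.0e0']
print(sum_numbers("-1 2.0e0\n+3.0\n"))   # 4.0
```

One consequence of the character class: a token like 3-4 stays in one piece, fails float(), and is skipped rather than being read as two numbers.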
I would like to know how I can get all lines after the first one from a file in Python.
I've tried with this:
fr = open("numeri.txt", "r")
count = 0
while True:
    line = fr.readline(count)
    if line == "":
        break
    count += 1
    print(line)
fr.close()
Could anyone help me? Thanks
You could add an extra if statement to check whether count != 0, since it will be 0 on the first pass through the loop.
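Keep in mind that the argument to readline() is a maximum number of characters to read, not a line number, so readline(0) returns an empty string and the loop in the question stops right away. Here is a small sketch using io.StringIO as a stand-in for the file (the sample contents are made up), followed by a version of the counter check built on enumerate. The name lines_after_first is hypothetical.

```python
import io

f = io.StringIO("first\nsecond\nthird\n")   # stands in for numeri.txt

print(repr(f.readline(0)))   # '' : at most 0 characters are read
print(repr(f.readline(3)))   # 'fir' : at most 3 characters
print(repr(f.readline()))    # 'st\n' : the rest of the first line

def lines_after_first(fobj):
    """Return every line except the first, using the index from enumerate."""
    return [line for index, line in enumerate(fobj) if index != 0]

print(lines_after_first(io.StringIO("first\nsecond\nthird\n")))
# ['second\n', 'third\n']
```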
I don't know if I understood correctly, but to get all the lines except the first one you can simply do:
lines = []
with open("numeri.txt") as fobj:
    lines = fobj.readlines()[1:]
count = len(lines)+1 if lines else 0  # If you want to maintain the same counting as in your example
count = 0
with open("numeri.txt", 'r') as f:
    next(f, None)  # skip the first line (no error if the file is empty)
    for count, line in enumerate(f.readlines()):  # read remaining lines with count
        if not line:  # If line equals "" this will be True
            break
        print(count, line)
count -= 1  # To ignore last lines count.
Just read the first line without using it:
with open('numeri.txt') as f:
    f.readline()
    lines = f.readlines()
print(*lines, sep='')
To ignore the first line you can also use next(f) (instead of f.readline()).
This is also fine:
with open('numeri.txt') as f:
    lines = f.readlines()[1:]
print(*lines, sep='')
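Both variants read the remaining lines into a list. For a large file it may be preferable to skip the first line lazily. Here is a sketch with itertools.islice, using io.StringIO in place of the real file. The helper name skip_first is hypothetical.

```python
import io
from itertools import islice

def skip_first(fobj, n=1):
    """Iterate over the lines of fobj after the first n, one at a time."""
    return islice(fobj, n, None)

demo = io.StringIO("header\n1\n2\n")   # stand-in for an open file
for line in skip_first(demo):
    print(line, end='')
```

With a real file this would be used as for line in skip_first(f): inside the with block.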
Try using l[1:]. It returns a copy of l that contains every element except the first one.
with open("numeri.txt", "r") as f:
    content = f.readlines()[1:]
for line in content:
    print(line.strip('\n'))  # Strip the newline to avoid a double \n, since print already ends with '\n'
EDIT: Based on #riccardo-bucco's solution:
with open("numeri.txt", "r") as f:
    content = f.readlines()[1:]
print(*content, sep='')
To print all but the first line:
with open('numeri.txt', 'r') as f:
    output = ''.join(f.readlines()[1:])
print(output)
Start count at 1 so it skips the first line:
...
count = 1
...
I have written code that extracts floating point numbers from a text file and produces a list of the numbers.
My challenge is summing the consecutive numbers and finding their average.
I am not allowed to use the sum function, and I am new to Python.
This is the code I have written so far. What can I do to add up the numbers in the list?
fh = open(fname)
for line in fh:
    if line.startswith("X-DSPAM-Confidence:") : continue
    # print(line)
count = 0
for line in fh:
    if line.startswith("X-DSPAM-Confidence:"):
        count = count + 1
# print(count)
for line in fh:
    if line.startswith("X-DSPAM-Confidence:"):
        # print(line)
        xpos = line.find(' ')
        # print(xpos)
        num = line[xpos : ]
        # print(float(num))
        fnum = float(num)
# print(fnum)
total = 0
for i in fnum:
    total += int(i)
    print(total)
Error:"float object not iterable on line 24" ... line 24 is the 4th for loop
First, an open file can be iterated only once, and your code has three loops that start with for line in fh:. After the first loop the file pointer is at the end of the file, so the following loops end immediately without running their bodies. A single pass is enough, and with should be preferred for opening the file.
Next, somewhere in the loop you get a float value in fnum. Just initialize total before starting the loop, and add fnum when you get it:
total = 0
with open(fname) as fh:
    for line in fh:
        if line.startswith("X-DSPAM-Confidence:"):
            # print(line)
            xpos = line.find(' ')
            # print(xpos)
            num = line[xpos : ]
            # print(float(num))
            fnum = float(num)
            # print(fnum)
            total += fnum
# print(total)
with ensures that the file is closed cleanly when the block ends.
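The point about a file being iterable only once can be checked directly. Here is a small sketch, using io.StringIO in place of the mailbox file (the two sample lines are invented). seek(0) rewinds the file if a second pass is really needed.

```python
import io

fh = io.StringIO("X-DSPAM-Confidence: 0.8475\nSubject: hello\n")

first_pass = [line for line in fh]
second_pass = [line for line in fh]        # the iterator is already at the end
print(len(first_pass), len(second_pass))   # 2 0

fh.seek(0)                                 # rewind to the beginning
third_pass = [line for line in fh]
print(len(third_pass))                     # 2
```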
fnum is a float. It's not a list, so it's not iterable and cannot be used in a for loop.
You probably don't need a list to determine the total and the average:
fname = "c:\\mbox-short.txt"
fh = open(fname)
count = 0
total = 0
for line in fh:
    if line.startswith("X-DSPAM-Confidence:"):
        xpos = line.find(' ')
        num = line[xpos : ]
        fnum = float(num)
        total += fnum
        count += 1
print("Total = " + str(total))
print("Average = " + str(total / count))
print("Number of items = " + str(count))
You don't have to use startswith in this case; it is simpler to call split on each line of the file, which breaks the line into a list of words. Check whether the first word is X-DSPAM-Confidence:. If it is, take the corresponding value of interest, which is at index 1. Below is the code:
total = 0
number_of_items = 0
with open("dat.txt", 'r') as f:
    for line in f:
        fields = line.split()
        if fields != []:
            if fields[0] == "X-DSPAM-Confidence:":
                number_of_items += 1
                total += float(fields[1])
print(total)
print(number_of_items)
avg = (total/number_of_items)
print(avg)
I saved your data in a text file named "dat.txt".
Hope it helps !!!
from collections import Counter
f = open('input.txt')
lines = f.readlines()
counter = 0
freq = []
for line in lines:
    conv_int = int(line)
    counter = counter + conv_int
    freq.append(counter)
for i in freq:
    print(Counter(freq))
print(counter)
This code loops through a text file containing various negative and positive numbers and adds them together, starting from zero. However, I was wondering how to find out how many times each number occurs in this file.
Your file has an integer on each line, and you want the total sum and the frequency of each integer, right? Try this.
from collections import Counter
with open("input.txt", "rt") as f:
    total = 0
    count = Counter()
    for line in f:
        conv_int = int(line)
        total += conv_int
        count[conv_int] += 1
print(count)
print(total)
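Once the counts are in a Counter, individual values can be looked up and most_common() returns them sorted by frequency. Here is a short sketch on made-up numbers rather than the asker's input.txt:

```python
from collections import Counter

numbers = [3, -1, 3, 7, -1, 3]        # invented sample data
frequency = Counter(numbers)

print(sum(numbers))                   # 14
print(frequency[3])                   # 3
print(frequency.most_common(2))       # [(3, 3), (-1, 2)]
print(frequency[42])                  # 0 : missing keys count as zero
```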
collections.Counter expects an iterable as its argument, not a single item:
import collections
with open('input.txt', 'r') as input_file:
    numbers = [int(line) for line in input_file]
numbers_sum = sum(numbers)
numbers_frequency = collections.Counter(numbers)
But if efficiency is not an issue for you and you're just trying to sum all numbers in a file and count their frequency, you don't need to import a library just to do that:
with open('input.txt', 'r') as input_file:
    numbers = [int(line) for line in input_file]
numbers_sum = sum(numbers)
numbers_frequency = {n: numbers.count(n) for n in set(numbers)}
What I have to do is figure out how to ask a user for a text file and output the average of all the numbers in it. By tinkering around I have figured out how to find the average of a list I made, but I'm not sure how to handle a list that the user gives me. This is what I have right now:
with open('average', 'wt') as myFile:
    myFile.write('3\n')
    myFile.write('45\n')
    myFile.write('83\n')
    myFile.write('21\n')
with open('average', 'rt') as myFile:
    total, n = 0, 0
    for line in myFile:
        total += int(line)
        n += 1
    print(float(total) / n)
Supposing that there is one number on each line of the file:
with open(input('Filename: '), 'r') as f:
    numbers = [int(a.strip()) for a in f]
print('Average is {}'.format(sum(numbers)/len(numbers)))
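Because the filename comes from the user, the open() call can fail, and a stray line can break int(). Here is a sketch that separates the calculation from the prompt so it can be reused and tested. average_from_file and main are hypothetical names, and one integer per line is assumed, with blank lines skipped.

```python
def average_from_file(path):
    """Return the mean of the integers in path, one per line; None if empty."""
    with open(path) as f:
        numbers = [int(line) for line in f if line.strip()]
    return sum(numbers) / len(numbers) if numbers else None

def main():
    path = input('Filename: ')
    try:
        print('Average is {}'.format(average_from_file(path)))
    except FileNotFoundError:
        print('No such file: {}'.format(path))
    except ValueError as exc:
        print('Not a number: {}'.format(exc))
```

Calling main() prompts for the filename. With the four values from the question (3, 45, 83, 21) the result is 38.0.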
Something like this? (Python 2)
import string
fileName = raw_input("What file name: ")
lines = []
try:
    file = open(fileName)
    lines = file.readlines()
    file.close()
except IOError:
    print "Unable to open file"
sum = 0
values = 0
if(len(lines) > 0):
    for line in lines:
        value = 0
        try:
            value = int(string.strip(line))
        except ValueError:
            pass
        if(value != 0):
            sum = sum + value
            values += 1
    print "Average = %f for %d lines, sum = %f"%(sum/values,values,sum)
else:
    print "No lines in the file"
NOTE: This assumes one number per line. It will not count blank lines or lines that have text. Other than that, junk on the lines or a bad file should not cause an exception, etc.
This was the test file (there are blank lines):
10
20
30
40
50
23
5
asdfadfs
s
And the output:
What file name: numbers.txt
Average = 25.000000 for 7 lines, sum = 178.000000
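The answer above is written for Python 2 (raw_input, print statements, string.strip), and its reported average of 25.000000 comes from integer division of 178 by 7. Below is a rough Python 3 sketch of the same logic, with the file handling replaced by a list of lines so it runs standalone. summarize is an illustrative name, and unlike the original it does not drop lines whose value is 0.

```python
def summarize(lines):
    """Return (average, count, total) for the lines that parse as integers."""
    total = 0
    count = 0
    for line in lines:
        try:
            total += int(line.strip())
            count += 1
        except ValueError:
            pass  # blank line or text: ignore
    average = total / count if count else 0.0
    return average, count, total

test_lines = ["10", "20", "", "30", "40", "50", "23", "5", "asdfadfs", "s"]
average, count, total = summarize(test_lines)
print("Average = %f for %d lines, sum = %d" % (average, count, total))
# Average = 25.428571 for 7 lines, sum = 178
```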