How to process words and numbers in a file in python - python

Hello, I'm a few weeks into Python and am now learning about files. My program can sum the numbers in a file when the file contains only numbers, but now the file contains words as well as numbers. How do I make it ignore the words so that it sums to 186?
def sum_numbers_in_file(filename):
    """reads all the numbers in a file and returns the sum of the numbers"""
    filename = open(filename)
    lines = filename.readlines()
    result = 0
    for num in lines:
        result = result + int(num)
        num.rstrip()
    filename.close()
    return result

answer = sum_numbers_in_file('sum_nums_test_01.txt')
print(answer)
This is in the file:
1
Pango
2
Whero
3
4
10
Kikorangi
20
40
100
-3
4
5

You can easily add a try-except statement inside the function so that it only adds the lines that are numbers:
def sum_numbers_in_file(filename):
    """reads all the numbers in a file and returns the sum of the numbers"""
    filename = open(filename)
    lines = filename.readlines()
    result = 0
    for num in lines:
        try:
            # int() ignores surrounding whitespace, so no rstrip() is needed
            result = result + int(num)
        except ValueError:
            # the line is not an integer (e.g. a word), so skip it
            pass
    filename.close()
    return result

answer = sum_numbers_in_file('sum_nums_test_01.txt')
print(answer)
Or you can use the isalpha method:
def sum_numbers_in_file(filename):
    """reads all the numbers in a file and returns the sum of the numbers"""
    filename = open(filename)
    lines = filename.readlines()
    result = 0
    for num in lines:
        num = num.rstrip()
        if not num.isalpha():
            result = result + int(num)
    filename.close()
    return result

answer = sum_numbers_in_file('sum_nums_test_01.txt')
print(answer)
isalpha() returns True only if the string is non-empty and every character is a letter, so not num.isalpha() skips the lines that consist purely of letters. It is not a real test for a number, though: anything else passes the check, including symbols ("$"), mixed text ("abc1"), decimal numbers ("2.5") and empty lines, and int() will then raise a ValueError on those lines.
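To see which lines each check lets through, here is a small sketch. The classify helper is made up for illustration, and the samples mix lines from the question's file with a few hypothetical extras ("3.5", "$" and an empty line).

```python
def classify(text):
    """Report how str.isalpha, str.isdigit and int() treat one stripped line."""
    try:
        int(text)
        int_ok = True
    except ValueError:
        int_ok = False
    return {"isalpha": text.isalpha(), "isdigit": text.isdigit(), "int_ok": int_ok}

# "10", "Pango" and "-3" come from the question; the rest are hypothetical.
samples = ["10", "Pango", "-3", "3.5", "$", ""]
report = {s: classify(s) for s in samples}
for s, r in report.items():
    print(repr(s), r)
```

Only "10" and "-3" survive int(), while "3.5", "$" and the empty line are neither alphabetic nor convertible, which is why the try-except version is the safer of the two.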

You can use a try-except block, an advanced yet effective way of handling errors. Use this in your for loop:
try:
    result += int(num)
except ValueError:
    pass
Normally it's good practice to do something in the except clause, but here we only want to skip the line, so we just pass. The try means we attempt the conversion; if it fails with a ValueError, we go to the except part.

I would suggest using a try/except block:
with open("words.txt") as f:
    nums = []
    for line in f:
        try:
            nums.append(float(line))
        except ValueError:
            pass
result = sum(nums)
If you want an alternative, a list comprehension can collect all the numerical values (this assumes every line that is not purely letters is a valid number):
with open("words.txt") as f:
    nums = [float(line.strip()) for line in f if not line.strip().isalpha()]
result = sum(nums)
In the first version, I convert each line into a float and append that value to the nums list. If the line is not a numerical value, float() raises a ValueError and the line is simply skipped, hence pass.
You cannot use .isnumeric() here, as it only returns True for strings made up entirely of numeric characters such as digits. A decimal point or a minus sign makes it return False, so decimals and negative numbers would be skipped.

Here are a couple of ways you can try using isdigit:
value = 0
with open("sum_nums_test_01.txt") as f:
    for line in f:
        if line.strip().isdigit():
            value += int(line)

with open("sum_nums_test_01.txt") as f:
    value = sum(int(line) for line in f if line.strip().isdigit())
Note that isdigit() returns False for "-3", so both versions skip negative numbers.
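As a quick check, the sketch below runs an isdigit filter and a try/except version over the question's data, inlined as a string so that it runs without a file on disk. The two function names are made up for illustration.

```python
import io

# Contents of the question's file, inlined for the sketch.
DATA = "1\nPango\n2\nWhero\n3\n4\n10\nKikorangi\n20\n40\n100\n-3\n4\n5\n"

def sum_with_isdigit(lines):
    """Sum only the lines that consist purely of digit characters."""
    return sum(int(line) for line in lines if line.strip().isdigit())

def sum_with_try(lines):
    """Sum every line that int() can parse, skipping the rest."""
    total = 0
    for line in lines:
        try:
            total += int(line)
        except ValueError:
            pass
    return total

isdigit_total = sum_with_isdigit(io.StringIO(DATA))
try_total = sum_with_try(io.StringIO(DATA))
print(isdigit_total, try_total)
```

The isdigit version misses the -3 line and gives 189, so only the try/except version reaches the 186 the question asks for.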

Related

My code doesn't work for numbers with 2 or more digits or for negative numbers

Basically, this is my task. Extract numbers from a text file and then calculate the sum of them.
I wrote the code, but it doesn't work correctly with numbers of 2 or more digits or with negative numbers. What should I do?
f = open('file6.txt', 'r')
suma = 0
file = f.readlines()
for line in file:
    for i in line:
        if i.isdigit() == True:
            suma += int(i)
print("The sum is ", suma)
file6.txt:
1
10
Output:
The sum is 2
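In your case, you go through the file line by line in the outer loop and then look at every character in the inner loop, so each digit is added on its own (10 contributes 1 + 0).
If you check the whole line instead, the \n at the end of each line makes .isdigit() return False unless you strip it first.
So your updated code should look like this:
f = open('file6.txt', 'r')
suma = 0
file = f.readlines()
for line in file:
    # isdigit() is still False for negative numbers such as "-5",
    # so those lines are skipped by this check
    if line.strip().isdigit():
        suma += int(line)
print("The sum is ", suma)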
Hope it helps!
Use re.split to split the input into words on anything that is not part of a number. Then try to convert each word into a number and silently skip it if this fails.
import re
sum_nums_in_file = 0
with open('file6.txt') as f:
    for line in f:
        for word in re.split(r'[^-+\dEe.]+', line):
            try:
                num = float(word)
                sum_nums_in_file += num
            except ValueError:
                pass
print(f"The sum is {sum_nums_in_file}")
This works for example on files such as this:
-1 2.0e0
+3.0
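The same idea can be tried without a file. This is only a sketch: sum_numbers is a hypothetical wrapper around the loop above, and the second sample string is invented to show how leftover fragments of words are discarded.

```python
import re

def sum_numbers(text):
    """Sum every token that float() accepts after splitting on non-number characters."""
    total = 0.0
    for line in text.splitlines():
        for word in re.split(r"[^-+\dEe.]+", line):
            try:
                total += float(word)
            except ValueError:
                # empty strings and stray letters such as "e" end up here
                pass
    return total

print(sum_numbers("-1 2.0e0\n+3.0\n"))
print(sum_numbers("price: 10 apples"))
```

Because E and e are kept for exponents, a word like "price" leaves a lone "e" behind after splitting. float() rejects it, so it does not affect the total.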

How do I iterate through a file containing a list of floating point numbers in Python?

I need to write a program that reads a file containing a list of floating-point numbers and counts how many of those numbers are larger than a user-specified threshold.
numbers.txt -
5.0
15.0
25.0
This is my python code -
in_file = open("numbers.txt", "r")
number = float(in_file.read()) # error in python
user_input = float(input("Threshold: "))
if number > user_input:
    print(number)
in_file.close()
Python is unable to convert the string to a float because the numbers have a new line after each number and python is trying to convert that into a float. I tried to change line 2 in my code to add a strip method but it still comes up with the same error.
for line in open("numbers.txt", "r"):
    line = line.replace("\n","")
    num = float(line)
I'm sure you can continue from here.
Try this:
with open('numbers.txt') as fp:
    lst = [float(line.strip()) for line in fp if line.strip()]
user_input = float(input("Threshold: "))
for num in lst:
    if num > user_input:
        print(num)
You can try this workaround:
inputdata = []
with open('data.txt') as f:
    for row in f:
        try:
            number = float(row)
            inputdata.append(number)
        except ValueError:
            # blank or non-numeric line, skip it
            pass
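The question asks for a count of the numbers above the threshold, whereas the snippets above print or collect them. A minimal sketch of the counting step follows. The count_above function is a made-up name, and the file contents are inlined with io.StringIO so that the example runs on its own.

```python
import io

def count_above(lines, threshold):
    """Count the lines that parse as floats and are larger than threshold."""
    count = 0
    for line in lines:
        text = line.strip()
        if not text:
            continue  # ignore blank lines
        try:
            value = float(text)
        except ValueError:
            continue  # ignore lines that are not numbers
        if value > threshold:
            count += 1
    return count

sample = io.StringIO("5.0\n15.0\n25.0\n")
result = count_above(sample, 10.0)
print(result)
```

With a real file, the same function can be given the open file object, for example inside with open("numbers.txt") as fp.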

How to extract all the numbers from a text file using re.findall() and compute the sum using a for-loop?

The basic outline of this problem is to read the file, look for integers using re.findall() with the regular expression [0-9]+, convert the extracted strings to integers, and sum them. I'm getting a different result; the sum is supposed to end with 209. Also, how can I simplify my code? Thanks (here is the txt file http://py4e-data.dr-chuck.net/regex_sum_167791.txt)
import re
hand = open("regex_sum_167791.txt")
total = 0
count = 0
for line in hand:
    count = count+1
    line = line.rstrip()
    x = re.findall("[0-9]+", line)
    if len(x)!= 1 : continue
    num = int(x[0])
    total = num + total
print(total)
Assuming that you need to sum all the numbers in your txt:
import re

total = 0
with open("regex_sum_167791.txt") as f:
    for line in f:
        total += sum(map(int, re.findall(r"\d+", line)))
print(total)
# 417209
Explanation
To start with, use with when you open a file so that the file is closed automatically once the block finishes.
The following lines were removed because they are redundant or wrong:
count = count+1: Not used.
line = line.rstrip(): re.findall takes care of extraction, so you don't have to worry about stripping lines.
if len(x)!= 1 : continue: It seems you wanted to skip lines with no digits, but this also skips every line that contains two or more numbers, which is why your total comes out too low. Since sum(map(int, re.findall(r"\d+", line))) returns zero for a line with no matches, the check is unnecessary anyway.
num = int(x[0]): Finally, this grabs only the first number found on the line. If a line contains two or more numbers, the rest are lost. And since int cannot be applied directly to a list, I used map(int, ...).
You were almost there:
import re
hand = open("regex_sum_167791.txt")
total = 0
for line in hand:
    line = line.rstrip()
    x = re.findall("[0-9]+", line)
    # add every number found on the line, not just the first
    for i in x:
        total += int(i)
print(total)
Answer: 417209
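For a self-contained check of the same approach, the sketch below applies re.findall to a short inline string instead of the course file. The sample text and the sum_integers name are invented for illustration.

```python
import re

def sum_integers(text):
    """Sum every run of digits found anywhere in text."""
    return sum(int(match) for match in re.findall(r"[0-9]+", text))

# Hypothetical sample: one line with a single number, one with three, one with none.
sample = "Why should you learn to write programs? 7746\n12 1929 8827\nno digits here\n"
total = sum_integers(sample)
print(total)
```

Here 7746 + 12 + 1929 + 8827 gives 18514. A version that keeps only lines with exactly one match would report 7746, which is the kind of shortfall the question describes.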

Regular Expressions assignment

I am taking an online class for python. I've been at this for 2 weeks. I've written the following code to find numbers in a sample text document. The problem I'm having is when I move from line to line and run the regex, it finds the first set of numbers, then skips any remaining numbers on the line and moves to the next line where it matches only the first number on the line. My code is below:
#!/usr/bin/python
import re
try:
    fname = raw_input("Enter file name: ")
    fh = open(fname)
except:
    print 'Invalid Input'
    quit()
numlist = list()
for line in fh:
    nums = re.findall('[0-9]+',line)
    if len(nums) < 1 : continue
    num = int(nums[0])
    numlist.append(num)
print (numlist)
You are explicitly telling it to skip all numbers but the first:
num = int(nums[0])
Instead, use a list comprehension to convert the matches to int and append the entire list using extend():
numlist.extend([int(x) for x in nums])
As others already noted, you're discarding all other numbers in the list and taking only the first element. You can use the map function to convert the numbers to int and then extend the list:
for line in fh:
    nums = re.findall('[0-9]+',line)
    if len(nums) < 1 : continue
    nums = map(int, nums)
    numlist.extend(nums)
The problem is that you're not looping on nums, but only appending the first item in the nums list.
To solve this, you should iterate on nums and append each item.
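The difference between appending nums[0] and extending with every match can be seen side by side. This sketch is written for Python 3 and uses made-up sample lines in place of the course file.

```python
import re

# Hypothetical lines standing in for the sample text document.
lines = ["3036 apples and 7209 pears", "no numbers", "42"]

first_only = []
all_numbers = []
for line in lines:
    nums = re.findall(r"[0-9]+", line)
    if not nums:
        continue
    first_only.append(int(nums[0]))           # keeps one number per line
    all_numbers.extend(int(n) for n in nums)  # keeps every number on the line

print(first_only)
print(all_numbers)
```

The first list loses 7209 because only the first match of each line is stored, which matches the behaviour described in the question.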

How to sum numbers from a text file in Python

I have a code that relies on me reading a text file, printing off the numbers where there are numbers, printing off specific error messages where there are strings instead of numbers, then summing ALL the numbers up and printing their sum (then saving ONLY the numbers to a new text file).
I have been attempting this problem for hours, and I have what is written below.
I do not know why my code does not seem to be summing up properly.
And the python code:
f=open("C:\\Users\\Emily\\Documents\\not_just_numbers.txt", "r")
s=f.readlines()
p=str(s)
for line in s:
    printnum=0
    try:
        printnum+=float(line)
        print("Adding:", printnum)
    except ValueError:
        print("Invalid Literal for Int() With Base 10:", ValueError)
for line in s:
    if p.isdigit():
        total=0
        for number in s:
            total+=int(number)
            print("The sum is:", total)
I have a code that relies on me reading a text file, printing off the
numbers where there are numbers, printing off specific error messages
where there are strings instead of numbers, then summing ALL the
numbers up and printing their sum (then saving ONLY the numbers to a
new text file).
So you have to do the following:
Print numbers
Print a message when there isn't a number
Sum the numbers and print the sum
Save only the numbers to a new file
Here is one approach:
total = 0
with open('input.txt', 'r') as inp, open('output.txt', 'w') as outp:
    for line in inp:
        try:
            num = float(line)
            total += num
            outp.write(line)
        except ValueError:
            # strip the trailing newline so the message stays on one line
            print('{} is not a number!'.format(line.strip()))
print('Total of all numbers: {}'.format(total))
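This is a very short way to sum all the numbers in your file (you will have to add the try and except yourself). Note that [0-9]+ matches only runs of digits, so signs and decimal points are ignored:
import re
with open("C:\\Users\\Emily\\Documents\\not_just_numbers.txt", 'r') as f:
    print(sum(float(num) for num in re.findall('[0-9]+', f.read())))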
You are checking the wrong condition:
for line in s:
    if p.isdigit():
p is this:
s=f.readlines()
p=str(s)
Being a stringified version of a list, it will start with a '[', and hence p.isdigit() will always be false. You instead want to check line.strip().isdigit() (each line still ends with a newline, which isdigit() rejects), and you want to initialise total only once instead of each time around the loop:
total = 0
for line in f:
    if line.strip().isdigit():
        total += int(line)
Note that by iterating over f directly, you also don't need to ever call readlines().
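A short sketch makes the failing checks visible. The list below is a hypothetical stand-in for what readlines() returns for a file containing 1, 2 and hello.

```python
# Hypothetical result of f.readlines() for a three-line file.
lines = ["1\n", "2\n", "hello\n"]

p = str(lines)  # the text of the whole list, brackets and quotes included
raw_checks = [line.isdigit() for line in lines]
stripped_checks = [line.strip().isdigit() for line in lines]

print(p)
print(p.isdigit())
print(raw_checks)
print(stripped_checks)
```

The stringified list and the unstripped lines all fail isdigit(). Only the stripped lines give True for "1" and "2".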
Here is what you can do:
data.txt:
1
2
hello
3
world
4
code:
total = 0
with open('data.txt') as infile:
    with open('results.txt', 'w') as outfile:
        for line in infile:
            try:
                num = int(line)
                total += num
                print(num, file=outfile)
            except ValueError:
                print(
                    "'{}' is not a number".format(line.rstrip())
                )
print(total)
--output:--
'hello' is not a number
'world' is not a number
10
$ cat results.txt
1
2
3
4
You can also try this:
f=open("C:\\Users\\Emily\\Documents\\not_just_numbers.txt", "r")
ww=open("C:\\Users\\Emily\\Documents\\not_just_numbers_out.txt", "w")
s=f.readlines()
p=str(s)
for line in s:
    #printnum=0
    try:
        #printnum+=float(line)
        print("Adding:", float(line))
        ww.write(line)
    except ValueError:
        print("Invalid Literal for Int() With Base 10:", ValueError)
total=0
for line in s:
    if line.strip().isdigit():
        total += int(line)
print("The sum is:", total)
# close both files so the output is flushed to disk
f.close()
ww.close()
Here, str.strip([chars]) means:
Return a copy of the string with the leading and trailing characters removed. The chars argument is a string specifying the set of characters to be removed. If omitted or None, the chars argument defaults to removing whitespace. The chars argument is not a prefix or suffix; rather, all combinations of its values are stripped
Every time you enter a new line you reset the total to zero if the number is a digit.
You might want to initialize your total before you enter the loop.
I tried debugging the for loop using isdigit and isalpha.
Apparently every line still ends with a newline, so it is not considered a digit or alphabetic, and these checks always evaluate to False.
As it turns out, you don't need the second for loop; you've done most of the program with your try/except statement.
Here's how I did it on my system.
f = open("/home/david/Desktop/not_just_numbers.txt", 'r')
s = f.readlines()
p = str(s)
total = 0
for line in s:
    #print(int(line))
    printnum = 0
    try:
        printnum += float(line)
        total += printnum
        #print("Adding: ", printnum)
    except ValueError:
        print("Invalid Literal for Int() With Base 10:", ValueError)
print("The sum is: ", total)
$ echo -e '1\n2\n3\n4\n5' | python -c "import sys; print(sum(int(l) for l in sys.stdin))"
Read from a file containing numbers separated by new lines:
total = 0
with open("file_with_numbers.txt", "r") as f:
    for line in f:
        total += int(line)
print(total)
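The loop above assumes every line is an integer. For the mixed file in this question, one possible arrangement is to separate the numbers from everything else first and then print, sum or save them as needed. The split_numbers helper below is a hypothetical name, and the input list mirrors the data.txt example from the earlier answer.

```python
def split_numbers(lines):
    """Return (numbers, rejected): floats that parsed, and the stripped lines that did not."""
    numbers, rejected = [], []
    for line in lines:
        text = line.strip()
        try:
            numbers.append(float(text))
        except ValueError:
            rejected.append(text)
    return numbers, rejected

numbers, rejected = split_numbers(["1\n", "2\n", "hello\n", "3\n", "world\n", "4\n"])
print("The sum is:", sum(numbers))
for text in rejected:
    print("'{}' is not a number".format(text))
```

Keeping the parsing in one function means the sum, the error messages and the output file can all be produced from the two lists without reading the input again. Note that float() also accepts strings such as "nan" and "inf", so a stricter check would be needed if those can appear.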
