Reading and adding numbers from a text document in Python - python

I have a text document with a list of numbers. I need a function that reads all of them and adds them together so that I can then average them. I'm able to print the individual numbers and count how many there are, but not add the actual numbers together. Currently my code gives me "ValueError: invalid literal for int() with base 10: ''"
def CalcAverage():
    scoresFile = open("data.txt", "r")
    line = scoresFile.readline()
    scoreCounter = 1
    scoreTotal = 0
    while line != "":
        line = scoresFile.readline()
        total = int(line) + int(line)
        scoreCounter = int(scoreCounter) + 1
    print(total)
    scoresFile.close()

Assuming this is the error you're getting, replicated in the REPL:
>>> int("")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: invalid literal for int() with base 10: ''
Then the issue is that you're trying to convert an empty string into an int. This always raises an error. In your code the empty string comes from readline(), which returns '' once it reaches the end of the file.
You sorta check against this:
while line != "":
But the issue with this is where you assign line: immediately after this check. While your first line value will be checked, it will be immediately replaced by a new line = scoresFile.readline().
Thus, as you enter your while loop, line is the first line in your file. Then, it will assign the second line to line, dropping the first line's value. This will cause two different errors - one where you attempt in the following line to cast an empty string to an int, and one where you are ignoring the value of your first line. You need to alter how you do your check.
def calc_average():
    with open("data.txt", "r") as scores:
        counter = 0
        total = 0
        for line in scores.readlines():  # Returns the lines as a list
            total += int(line)
            counter += 1
        print(f"{total=} {counter=}")
Note the following good-hygiene practices in Python:
Name your variables and functions using snake case.
Use with to open a file. This calls close automatically, so you don't risk forgetting it.
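The function above prints the total and the count but stops short of dividing them. A minimal sketch of the averaging step follows. It assumes one integer per line and that blank lines should be skipped (the question doesn't show the file, so both are assumptions), and average_of_lines is a hypothetical helper name:

```python
def average_of_lines(lines):
    """Return the mean of the integers in `lines`, ignoring blank lines."""
    total = 0
    counter = 0
    for line in lines:
        text = line.strip()
        if not text:  # skip blank lines instead of passing "" to int()
            continue
        total += int(text)
        counter += 1
    if counter == 0:
        raise ValueError("no numbers found")
    return total / counter


def calc_average(path="data.txt"):
    # A file object yields its lines one at a time, so it can be passed directly.
    with open(path, "r") as scores:
        return average_of_lines(scores)
```

Keeping the arithmetic in a function that accepts any iterable of strings makes it easy to try with a plain list before pointing it at the real file.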

Related

Error in Python Code Trying To Open and Access a File

Here is info from the .txt file I am trying to access:
Movies: Drama
Possession, 2002
The Big Chill, 1983
Crimson Tide, 1995
Here is my code:
fp = open("Movies.txt", "r")
lines = fp.readlines()
for line in lines:
    values = line.split(", ")
    year = int(values[1])
    if year < 1990:
        print(values[0])
I get an error message "IndexError: list index out of range". Please explain why or how I can fix this. Thank you!
Assuming your .txt file includes the "Movies: Drama" line, as you listed, it's because the first line of the text file has no comma in it. Therefore splitting that first line on a comma only results in 1 element (element 0), NOT 2, and therefore there is no values[1] for the first line.
It's not unusual for data files to have a header line that doesn't contain actual data. Libraries like pandas will typically handle this for you, but open() and readlines() don't differentiate between header and data lines.
The easiest thing to do is just slice your list variable (lines) so you don't include the first line in your loop:
fp = open("Movies.txt", "r")
lines = fp.readlines()
for line in lines[1:]:
    values = line.split(", ")
    year = int(values[1])
    if year < 1990:
        print(values[0])
Note the "lines[1:]" modification. This way you only loop starting from the second line (the first line is lines[0]) and go to the end.
The first line of the text file does not have a ", ", so when you split on it, you get a list of size 1. When you access the 2nd element with values[1] then you are accessing outside the length of the array, hence the IndexError. You need to do a check on the line before making the assumption about the size of the list. Some options:
Check the length of values and continue if it's too short.
Check that ', ' is in the line before splitting on it.
Use a regex which will ensure the ', ' is there as well as can ensure that the contents after the comma represent a number.
Preemptively strip off the first line in lines if you know that it's the header.
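The first option in the list above (check the length of values and continue) could be sketched as follows. titles_before is a hypothetical helper name, and treating the last comma-separated field as the year is an assumption based on the sample lines in the question:

```python
def titles_before(lines, cutoff_year):
    """Return titles from 'Title, year' lines whose year is below cutoff_year."""
    titles = []
    for line in lines:
        values = line.strip().split(", ")
        if len(values) < 2:  # header or blank line: nothing after a comma
            continue
        year_text = values[-1]  # assumption: the year is the last field
        if not year_text.isdigit():
            continue
        if int(year_text) < cutoff_year:
            titles.append(", ".join(values[:-1]))
    return titles


sample = [
    "Movies: Drama\n",
    "Possession, 2002\n",
    "The Big Chill, 1983\n",
    "Crimson Tide, 1995\n",
]
print(titles_before(sample, 1990))  # prints ['The Big Chill']
```

Unlike slicing off the first line, this also survives blank lines or extra header lines further down the file.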
The first line of your txt file has no second element, so values[1] is out of range.
Simply change your code to:
fp = open("Movies.txt", "r")
lines = fp.readlines()
for line in lines:
    try:  # <---- Here
        values = line.split(", ")
        year = int(values[1])
        if year < 1990:
            print(values[0])
    except (IndexError, ValueError):  # <-------- And here: skip lines without a valid year
        pass

Calculating with data from a txt file in python

Detailed:
I have a set of 200 or so values in a txt file and I want to select the first value b[0] and then go through the list from [1] to [199] and add them together.
So, [0]+[1]
if that's not equal to a certain number, then it would go to the next term i.e. [0]+[2] etc etc until it's gone through every term. Once it's done that it will increase b[0] to b[1] and then goes through all the values again
Step by step:
Select first number in list.
Add that number to the next number
Check if that equals a number
If it doesn't, go to next term and add to first term
Iterate through these until you've gone through all terms/ found a value which adds to target value
If gone through all values, then go to the next term for the starting add value and continue
I couldn't get it to work, if anyone can maybe provide a solution or give some advice? Much appreciated. I've tried looking at videos and other stack overflow problems but I still didn't get anywhere. Maybe I missed something, let me know! Thank you! :)
I've attempted it but gotten stuck. This is my code so far:
b = open("data.txt", "r")
data_file = open("data.txt", "r")
for i, line in enumerate(data_file):
    if (i+b)>2020 or (i+b)<2020:
        b=b+1
    else:
        print(i+b)
        print(i*b)
Error:
Traceback (most recent call last):
File "c:\Users\███\Desktop\ch1.py", line 11, in <module>
if (i+b)>2020 or (i+b)<2020:
TypeError: unsupported operand type(s) for +: 'int' and '_io.TextIOWrapper'
PS C:\Users\███\Desktop>
I would read the file into a list and then convert it into ints before actually dealing with the problem. Files are messy, and the less we have to deal with them the better.
with open("data.txt", "r") as data_file:
    lines = data_file.readlines()  # reads the file into a list; the with block closes the file
j = 0  # you could use a much more terse solution but this is easy to understand
for i in lines:
    lines[j] = int(i.strip())  # strip() removes the trailing newline as well
    j += 1
for i in lines:  # for every value of i we go through every value of j
    # so it would do x = [0] + [0], [0] + [1] ... [1] + [0] ...
    for j in lines:
        x = j + i
        if x == 2020:
            print(i * j)
Here are some things that you can fix.
You can't add the file object b to the integer i. You have to convert the lines to int by using something like:
integer_in_line = int(line.strip())
Also you have opened the same file twice in read mode with:
b = open("data.txt", "r")
data_file = open("data.txt", "r")
Opening it once is enough.
Make sure that you close the file after you have used it:
data_file.close()
To compare each number in the list with each other number in the list you'll need to use a double for loop. Maybe this works for you:
certain_number = 2020
data_file = open("data.txt", "r")
ints = [int(line.strip()) for line in data_file]  # make a list of all integers in the file
for i, number_at_i in enumerate(ints):  # loop over every integer in the list
    for j, number_at_j in enumerate(ints):  # loop over every integer in the list
        if number_at_i + number_at_j == certain_number:  # compare the integers to your certain number
            print(f"{number_at_i} + {number_at_j} = {certain_number}")
data_file.close()
Your problem is the following: the variables b and data_file are not actually the text that you are hoping they are. You should read up on reading text files in Python; there are many tutorials on that.
When you call open("path.txt", "r"), the open function returns a file object, not your text. If you want the text from the file, you should either call read or readlines. Also it is important to close your file after reading the content.
data_file = open("data.txt", "r")  # data_file is a file object
text = data_file.read()  # text is the actual text in the file in a single string
data_file.close()
Alternatively, you could also read the text into a list of strings, where each string represents one line in the file: lines = data_file.readlines().
I assume that your "data.txt" file contains one number per line, is that correct? In that case, your lines variable will be a list of the numbers, but they will be strings, not integers or floats. Therefore, you can't simply use them to perform calculation. You would need to call int() on them.
Here is an example of how to do it. I assumed that your text file looks like this (with arbitrary numbers):
1
2
3
4
...
file = open("data.txt", "r")
lines = file.readlines()
file.close()
# This creates a new list where the numbers are actual numbers and not strings
numbers = []
for line in lines:
    numbers.append(int(line))
target_number = 2020
starting_index = 0
found = False
for i in range(starting_index, len(numbers)):
    temp = numbers[i]
    for j in range(i + 1, len(numbers)):
        temp += numbers[j]
        if temp == target_number:
            print(f'Target number reached by adding numbers from {i} to {j}')
            found = True
            break  # This stops the inner loop.
    if found:
        break  # This stops the outer loop
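The loop above keeps a running sum from index i to index j. If the goal is instead the pair check described in the question (one number plus one other number), a minimal sketch could use itertools.combinations from the standard library. find_pair and read_numbers are hypothetical helper names, and the file is assumed to hold one integer per line:

```python
from itertools import combinations


def find_pair(numbers, target):
    """Return the first pair of entries at different positions that sums to target."""
    for first, second in combinations(numbers, 2):
        if first + second == target:
            return first, second
    return None  # no pair adds up to the target


def read_numbers(path):
    # Assumption: one integer per line; blank lines are skipped.
    with open(path, "r") as data_file:
        return [int(line) for line in data_file if line.strip()]
```

Calling find_pair(read_numbers("data.txt"), 2020) would return the two numbers, or None if nothing matches. Because combinations pairs entries by position, a number is never added to itself.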

How to set range index in for loop

I'm creating a job script to use daily, and I'm trying to transform a line-sequential txt file into a data-line txt file.
I've already found the index of the header lines and also extracted the 'laborer code' for each header.
for line_no, line in enumerate(data):
    if line[0:10] == 'FUNCIONARI':
        code = int(line[11:18])
        # print(line_no)
    else:
        line_no = -1
    for line in range(index=line_no, 32, 1):
        line += code
        print(line)
PyCharm returns: "SyntaxError: positional argument follows keyword argument"
How can I repeat the code gathered above on each of the next 32 lines?
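Solved!
code = 0
for line in data:
    if line[:10] == 'FUNCIONARI':  # a 10-character slice can only match a 10-character string
        code = line[11:16]
    line = code, line
Every line now receives the code. When a header supplies a new code, that code is carried down through the following lines until the next header changes it.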
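The same carry-forward idea can be sketched as a function that collects the results instead of rebinding line. The header layout used here ('FUNCIONARIO' followed by the code as the next whitespace-separated token) and the sample data are assumptions for illustration, since the real file isn't shown; tag_lines_with_code is a hypothetical name. Unlike the snippet above, this version leaves the header lines themselves out of the result:

```python
def tag_lines_with_code(data, header_prefix="FUNCIONARIO"):
    """Pair every non-header line with the code from the most recent header."""
    code = None
    tagged = []
    for line in data:
        if line.startswith(header_prefix):
            # Assumption: the code is the first token after the prefix.
            code = line[len(header_prefix):].split()[0]
            continue
        tagged.append((code, line))
    return tagged


# Hypothetical sample data, not taken from the original file.
data = [
    "FUNCIONARIO 0001 ANA",
    "item a",
    "item b",
    "FUNCIONARIO 0002 LUIS",
    "item c",
]
for code, line in tag_lines_with_code(data):
    print(code, line)
```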

How to append 3 latest results onto the user’s name in Python?

I am using Python to generate a maths quiz and store the results from the quiz along with the user's name in a text file. I plan to store the last three scores of the user by appending onto the user's name if it already exists in the text file and deleting older scores if there are more than three. This is what I have done so far:
with open("class%s.txt" % classnumber, "a") as my_class:
    my_class.write("{0}\n".format([name, score]))
with open("class%s.txt" % classnumber, "r+") as file:
    file.seek(0)
    scores = file.readlines()
    user_scores = {}
    for line in scores:
        name, score = line.rstrip('\n').split(' - ')
        score = int(score)
        if name not in user_scores:
            user_scores[name] = []
        user_scores[name].append(score)
        if len(user_scores[name]) > 3:
            user_scores[name].pop(0)
When I try to run this code, the following error message pops up:
Traceback (most recent call last):
  File "H:\Documents\Documents\Computing\Programming tasks\task 2 - Copy (2).py", line 56, in <module>
    name, score = line.rstrip('\n').split(' - ')
ValueError: need more than 1 value to unpack
Can anyone help me understand what I am doing wrong, please?
It's trying to unpack line.rstrip('\n').split(' - ') into two values, putting one into name and the other into score. However, it only found one thing in there, so it doesn't have anything to assign to score. Somewhere in a classn.txt file is a line that doesn't come in the form x - y, so when you try to split() it, it only produces one value. In the code shown, such lines come from your own write: "{0}\n".format([name, score]) writes the repr of a list, something like ['name', 7], which contains no ' - '.
The rstrip('\n') is harmless but not strictly needed here: readlines() keeps the trailing '\n' on each line, but after the split it ends up on score, and int() ignores surrounding whitespace.
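A minimal sketch of writing and reading the 'name - score' shape that the parsing loop expects, while keeping only the newest three scores per user. format_score and last_three_scores are hypothetical helper names, and skipping malformed lines rather than raising is an assumption about the desired behavior:

```python
def format_score(name, score):
    # Produces the same 'name - score' shape that the reader splits on.
    return "{0} - {1}\n".format(name, score)


def last_three_scores(lines, keep=3):
    """Build {name: [scores]} from 'name - score' lines, keeping the newest `keep`."""
    user_scores = {}
    for line in lines:
        name, sep, score = line.strip().partition(" - ")
        if not sep:  # line is not in 'name - score' form; skip it
            continue
        user_scores.setdefault(name, []).append(int(score))
        user_scores[name] = user_scores[name][-keep:]  # drop older scores
    return user_scores
```

With these, the write in the question would become my_class.write(format_score(name, score)), and the lines returned by readlines() could be passed straight to last_three_scores.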

How to get to an item in a python enumerate()

I'm new to python.
I have two files, one that contains symbols (words) and another that is a map file.
Both files are text files. The map file does contain form feeds.
I want to find the line in the map file that is one above the line that contains
a symbol in the map file.
I have the following code.
Osymbolfile="alistofsymbols"
mapfile="amapfile"
maplines = (line.rstrip('\n\f') for line in open(mapfile))
for line in Osymbolfile:
    line = (line.rstrip('\n') )
    print "line= ",line
    linecount = 0
    for index, scanline in enumerate(maplines):
        if line in scanline:
            print "scanline=",scanline
            print "index=",index
        else:
            linecount = linecount + 1
    print "= ",linecount
After print "index=",index ,I've tried print maplines[index-1], but I get an error.
How do I obtain the line above the index'th line in maplines?
Your maplines object is a generator expression; these produce items on demand and are not indexable.
You could use a list comprehension instead:
maplines = [line.rstrip('\n\f') for line in open(mapfile)]
Now you have an indexable list object instead. Even better, you can now loop over this object multiple times; you cannot do that with a generator.
The proper way to handle your case however, is to store the previous line only:
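def line_before(mapfile, something):  # hypothetical wrapper so that return is valid
    with open(mapfile) as maplines:
        prev = None
        for line in maplines:
            if something in line:
                return prev  # None if the match is on the very first line
            prev = line
    return None  # no line contained the symbol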
