How to not hardcode function in this example - python

The following links contain two CSV files that the function should be able to process: grades_1e_2a and grades_2e_4a.
However, my function can only process the second linked file, because it is hardcoded to range(4,8).
output: [91.5, 73.5, 81.5, 91.5]
The grades always start at index 4 of each row, but they do not necessarily end at index 7.
def class_avg(open_file):
    '''(file) -> list of float
    Return a list of assignment averages for the entire class given the open
    class file. The returned list should contain assignment averages in the
    order listed in the given file. For example, if there are 3 assignments
    per student, the returned list should contain 3 floats representing the 3 averages.
    '''
    marks=[[],[],[],[]]
    avgs = []
    for line in open_file:
        grades_list = line.strip().split(',')
        for idx,i in enumerate(range(4,8)):
            marks[idx].append(float(grades_list[i]))
    for mark in marks:
        avgs.append(float(sum(mark)/(len(mark))))
    return avgs
How do I fix this so that my code will be able to read both files, or any file?
I have already opened the file and iterated past the first line with file.readline() in a previous function on the same file.
Thanks for everyone's help in advance.
Updated progress: https://gyazo.com/064dd0d695e3a3e1b4259a25d1b0b1a0

Since both of your data sets start at the same column, the following works:
for idx,i in enumerate(range(4,len(grades_list))):
This should fulfil all the requirements I'm aware of so far:
def class_avg(open_file):
    '''(file) -> list of float
    Return a list of assignment averages for the entire class given the open
    class file. The returned list should contain assignment averages in the
    order listed in the given file. For example, if there are 3 assignments
    per student, the returned list should contain 3 floats representing the 3 averages.
    '''
    marks = None
    avgs = []
    for line in open_file:
        grades_list = line.strip().split(',')
        if marks is None:
            # Size marks from the first row: one list per grade column
            marks = []
            for i in range(len(grades_list) -4):
                marks.append([])
        for idx,i in enumerate(range(4,len(grades_list))):
            # float() rather than int(), so fractional grades also parse
            marks[idx].append(float(grades_list[i]))
    for mark in marks:
        avgs.append(float(sum(mark)/(len(mark))))
    return avgs

Try it like this:
def class_avg(open_file, start = 4, end = 8):
    ...
    ...
    for idx,i in enumerate(range(start, end)):

Try using this:
for idx,i in enumerate(range(4,len(grades_list))):
    marks[idx].append(int(grades_list[i]))
This assumes you know how many assignments there are and have initialized the marks list accordingly.
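The idea shared by the answers above (let each row decide how many grade columns there are) can also be written without pre-sizing marks at all. The following is only a minimal sketch, assuming Python 3 and that every row has the same number of grade columns; the first_grade_col parameter and the sample rows are hypothetical and not taken from the linked files.

```python
import io


def class_avg(open_file, first_grade_col=4):
    """Return one average per grade column, however many columns there are."""
    rows = []
    for line in open_file:
        fields = line.strip().split(',')
        if len(fields) <= first_grade_col:
            continue  # skip blank lines and rows without any grades
        rows.append([float(x) for x in fields[first_grade_col:]])
    # zip(*rows) transposes the rows into per-assignment columns;
    # it stops at the shortest row if the rows are uneven.
    return [sum(col) / len(col) for col in zip(*rows)]


# Hypothetical sample data: four leading columns, then three grades.
sample = io.StringIO(
    "1,Lee,Ann,A1,90,70,80\n"
    "2,Kim,Bo,A1,93,77,83\n"
)
print(class_avg(sample))  # [91.5, 73.5, 81.5]
```

Because the slice fields[first_grade_col:] runs to the end of each row, the same function handles a file with two assignments and a file with four.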

Related

Extracting multiple data from a single list

I'm working on a text file that contains several kinds of information. I converted it into a list in Python, and now I'm trying to separate the different data into different lists. The data is presented as follows:
CODE / DESCRIPTION / Unity / Value1 / Value2 / Value3 / Value4, and then the pattern repeats. An example would be:
P03133 Auxiliar helper un 203.02 417.54 437.22 675.80
My approach so far has been:
Creating lists to store each piece of information:
codes = []
description = []
unity = []
cost = []
Using loops, I find a code based on its structure, then use the code's index as the base for locating the remaining values.
Finding a code is easy, since it's a distinct type of information among the other data.
For the remaining values, I made a loop that finds the next numeric value after a code. That way I can delimit the rest of the indexes:
The unity is at the code's index + the offset to the first numeric value - 1, since it's the item just before the first numeric value in each line.
The cost is at the code's index + the offset to the first numeric value + 2; the third value is the only one I need to store.
The description is a little harder, since the number of elements that make it up varies across the list. So I used slicing, starting at the code's index + 1 and ending at the offset to the first numeric value - 2.
for i, carc in enumerate(txtl):
    if carc[0] == "P" and carc[1].isnumeric():
        codes.append(carc)
        j = 0
        while not txtl[i+j].isnumeric():
            j = j + 1
        description.append(" ".join(txtl[i+1:i+j-2]))
        unity.append(txtl[i+j-1])
        cost.append(txtl[i+j])
I'm facing a problem with this approach: although there will always be more elements in the list after a code, I get this error:
while not txtl[i+j].isnumeric():
txtl[i+j] list index out of range.
I'd welcome any help debugging my code, or even a different approach to the problem.
Note: I will also have to do this for a very similar data source, where the code is just a sequence of 7 digits and thus harder to tell apart from the other data. Any solution that covers this case is also appreciated!
A slight addition to your code should resolve this:
while i+j < len(txtl) and not txtl[i+j].isnumeric():
    j += 1
The first condition fails when out of bounds, so the second one doesn't get checked.
Also, please use a list of dict items instead of 4 different lists, for example:
thelist = []
thelist.append({'codes': 69, 'description': 'random text', 'unity': 'whatever', 'cost': 'your life'})
In this way you always have the correct values together in the list, and you don't need to keep track of where you are with indexes or other black magic...
EDIT after comment interactions:
Ok, so in this case you split the line you are processing on the space character, and then process the words in the line.
from pprint import pprint # just for pretty printing
textl = 'P03133 Auxiliar helper un 203.02 417.54 437.22 675.80'
the_list = []
def handle_line(textl: str):
    description = ''
    unity = None
    values = []
    for word in textl.split()[1:]:
        # it splits on space characters by default
        # you can ignore the first item in the list, as this will always be the code
        # str.isnumeric() doesn't work with floats, only integers. See https://stackoverflow.com/a/23639915/9267296
        if not word.replace(',', '').replace('.', '').isnumeric():
            if len(description) == 0:
                description = word
            else:
                description = f'{description} {word}' # I like f-strings
        elif not unity:
            # if unity is still None, that means it has not been set yet
            unity = word
        else:
            values.append(word)
    return {'code': textl.split()[0], 'description': description, 'unity': unity, 'values': values}
the_list.append(handle_line(textl))
pprint(the_list)
str.isnumeric() doesn't work with floats, only integers. See https://stackoverflow.com/a/23639915/9267296
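One way around the str.isnumeric() limitation is to attempt a float() conversion and catch the failure. The sketch below is an illustration only: is_number and parse_line are hypothetical helper names, and it assumes (as the question describes) that the token just before the first number is the unit and that the first token is always the code, even when the code is itself all digits. Note that float() also accepts strings such as 'inf' and 'nan', and rejects thousands separators like '1,234.5'.

```python
def is_number(word):
    """Return True if word parses as a float, e.g. '203.02' or '7'."""
    try:
        float(word)
        return True
    except ValueError:
        return False


def parse_line(line):
    """Split one record into code, description, unit and numeric values."""
    words = line.split()
    # Start the search at index 1 so a purely numeric code is not
    # mistaken for the first value.
    first_num = next(
        (i for i in range(1, len(words)) if is_number(words[i])),
        len(words),
    )
    return {
        'code': words[0],
        'description': ' '.join(words[1:first_num - 1]),
        'unity': words[first_num - 1],
        'values': [float(w) for w in words[first_num:]],
    }


record = parse_line('P03133 Auxiliar helper un 203.02 417.54 437.22 675.80')
print(record['description'], '|', record['unity'], '|', record['values'][2])
# Auxiliar helper | un | 437.22
```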

I'm struggling to remove this error from my function, when I try to test my code with the parameters

I'm struggling to remove this error from my function when I test my code with the parameters listed below under "the errors I'm getting".
Can someone please help me? I've been trying to fix this error and couldn't solve it.
To make my question clear, I have listed below:
The number.txt file that we have to work on for this code
the errors I'm getting
my code
the sample testing
Work with the customers.txt file in this question.
12345,Tom,Black,300.00,1998-01-30
23456,Alice,Smith,1200.50,1998-02-20
14567,Jane,White,900.00,1998-07-01
43564,Weilin,Zhao,450.25,1998-01-03
45432,Bina,Mehta,278.95,1998-03-21
the errors I'm getting
Test various parameters: '['customers file variable', '12345']'
ERROR:
Expected: ['12345', 'Tom', 'Black', '300.00', '1998-01-30']
Test various parameters: '['customers file variable', '45432']'
ERROR:
Expected: ['45432', 'Bina', 'Mehta', '278.95', '1998-03-21']
def customer_by_id(fh, id_number):
    """
    -------------------------------------------------------
    Find the record for a given ID in a sequential file.
    Use: result = customer_by_id(fh, id_number)
    -------------------------------------------------------
    Parameters:
        fh - file to search (file handle - already open for reading)
        id_number - the id_number to match (str)
    Returns:
        result - the record with id_number if it exists,
            an empty list otherwise (list)
    -------------------------------------------------------
    """
    num = fh.read().split("\id_number")
    if int(id_number) < len(num):
        result = int(num[id_number]).strip().split(",")
    else:
        result = []
    print(result)
Sample testing:
Find customer by id_number
Enter an ID: 23456
['23456', 'Alice', 'Smith', '1200.50', '1998-02-20']
----
Find customer by id_number
Enter an ID: 99999
[]
First, in your code, .split("\id_number") is causing a problem. It looks for the string \id_number in the text file, which is nonexistent. Therefore this results in a list with a single element, the latter being the whole text.
And then the if clause does not make sense, to be blunt. The left hand side is, e.g., integer 12345 representing the customer id. The right hand side, on the other hand, is the number of elements of the list, i.e., 1. Syntactically correct, but semantically wrong.
So when you look at this, the LHS is almost always a big number, say 12345, whereas the RHS is 1. This if clause, therefore, is never going to be executed. Instead, else part is reached, resulting in an empty list.
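The behaviour described above can be checked in isolation. This is a small demonstration only, using two of the sample records as an in-memory string; the backslash is doubled here so the separator is the same literal text without triggering an invalid-escape warning.

```python
text = (
    "12345,Tom,Black,300.00,1998-01-30\n"
    "23456,Alice,Smith,1200.50,1998-02-20\n"
)

# The separator never occurs in the text, so split() returns the whole
# text as a single element.
parts = text.split("\\id_number")
print(len(parts))                  # 1
print(parts[0] == text)            # True

# The if condition therefore compares a large ID with 1 and is False,
# which is why the else branch (the empty list) is always taken.
print(int("12345") < len(parts))   # False
```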
Here's my suggestion without using csv.
def customer_by_id(fh, id_number):
    records = fh.read().splitlines()  # fh is the already-open file handle
    records = [record.split(",") for record in records]
    for record in records:
        if record[0] == id_number:
            return record
    return []
with open("customers.txt", "r") as f:
    print(customer_by_id(f, "23456"))
After the first line of the function body, records is a list containing each line of customers.txt. So it would look like
['12345,Tom,Black,300.00,1998-01-30', '23456,Alice,Smith,1200.50,1998-02-20', ...]
Note that each element of the list is just a string.
The second line then converts each string to a list using split(). Now records is a list of lists, i.e., each element of records is itself a list. It looks like
[['12345', 'Tom', 'Black', '300.00', '1998-01-30'], ['23456', 'Alice', 'Smith', '1200.50', '1998-02-20'], ... ]
Next, the for loop iterates over these sub-lists and compares the first element (of the sub-list) to the id_number. If they match, the sub-list is returned (ending the function). Otherwise, the iteration continues. When none of them matches id_number, return [] is reached.
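The suggestion above deliberately avoids the csv module. For comparison, here is a minimal sketch of the same lookup written with the standard library's csv.reader, which reads the open file line by line instead of loading it all at once. The in-memory sample stands in for customers.txt and is only there so the snippet runs on its own.

```python
import csv
import io


def customer_by_id(fh, id_number):
    """Return the record whose first field equals id_number, else []."""
    for record in csv.reader(fh):
        # Empty lines come back as empty lists, so guard before indexing.
        if record and record[0] == id_number:
            return record
    return []


# Stand-in for the already-open customers.txt handle.
sample = io.StringIO(
    "12345,Tom,Black,300.00,1998-01-30\n"
    "23456,Alice,Smith,1200.50,1998-02-20\n"
)
print(customer_by_id(sample, "23456"))
# ['23456', 'Alice', 'Smith', '1200.50', '1998-02-20']
```

The comparison is between strings, so id_number should be passed as a str, as the docstring in the question specifies.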
If I understand your question correctly, you are looking to print the list of information:
def customer_by_id(fh, id_number):
    # Note: in this version fh is a file name, not an open file handle
    customer = []  # returned unchanged if no line matches
    with open(fh, "r") as file:
        for line in file.readlines():
            # Include the comma so "1234" does not match "12345,..."
            if line.startswith(str(id_number) + ","):
                customer = line.strip().split(",")
    return customer
while True:
    print("Find customer by id_number")
    id = input("Enter an ID: ")
    print(customer_by_id("customers.txt", id))
    print("---")
This will run until you either close the program or use the KeyboardInterrupt.

Comparing a list to a file and then counting every time that element appears in a list and putting it into a dictionary

I have a list of political parties:
x = ['Green', 'Republicans', 'Democrats', 'Independent']
and then I have a file that lists out which district was won by the political party, there are roughly sixty entries. I have some starter code but I don't quite know how to continue on.
def party_winners (political_party, filename):
    winning_party = {}
    with open (filename,'r') as f:
        for line in f:
            results=line.split(',')
Basically, all I want is to compare x to every row in my file and count the matches, so that if Republicans won 50 times in the file, my dictionary will say:
winning_party = {'Republicans':50, 'Democrats': 35, 'Independents': 0}
I knew I forgot something: each line of my file has the form
[county, votes, political party, person who ran]
Assuming that results is a list of the winners, in the exact form in which they appear in x, you could do something like this:
winning_party = {}
for region in results:
    if region not in winning_party:
        winning_party[region] = 0
    winning_party[region] += 1
This:
Creates the empty dictionary winning_party
Loops through all elements in your list of winners:
Checks whether the item is already in the dictionary and adds it if it isn't
Increments the count on the item by 1
Given a list of winners lst, you can use collections.Counter directly:
from collections import Counter
c = Counter(lst)
How you obtain lst depends on the structure of your csv file.
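As an illustration of how lst might be obtained, here is a minimal sketch that assumes each row has the form [county, votes, political party, person who ran] given in the question, and that every row represents one district's winner. Unlike the starter code it takes an already-open file rather than a file name, purely so the example can run on an in-memory sample; the sample rows are made up.

```python
import csv
import io
from collections import Counter


def party_winners(political_parties, open_file):
    """Count wins per party; parties with no wins are reported as 0."""
    # Column 2 holds the party name; strip stray spaces and newlines.
    counts = Counter(
        row[2].strip() for row in csv.reader(open_file) if len(row) >= 3
    )
    # A Counter returns 0 for keys it has never seen.
    return {party: counts[party] for party in political_parties}


# Made-up sample rows in the assumed column order.
sample = io.StringIO(
    "Adams,1200,Republicans,J. Doe\n"
    "Baker,900,Democrats,A. Roe\n"
    "Clark,1500,Republicans,B. Poe\n"
)
x = ['Green', 'Republicans', 'Democrats', 'Independent']
print(party_winners(x, sample))
# {'Green': 0, 'Republicans': 2, 'Democrats': 1, 'Independent': 0}
```

With a real file, the csv documentation recommends opening it with newline='' before passing it to csv.reader.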

How to put elements from a file into a 3D List? Python

I am trying to figure out how to get elements from a file into a 3D list.
For example, if my people.txt file looked like:
3 4
SallyLee
MallieKim
KateBrown
JohnDoe
TreyGreen
SarahKind
But I ONLY want SallyLee etc in the 3D list without the top numbers.
So far I have coded:
def main():
    list = []
    peopleFile = open("people.txt")
    peopleRead = peopleFile.readlines()
    for lines in peopleRead:
        list.append([lines])
    peopleFile.close()
    print(list)
main()
This then prints it WITH the numbers, and not in a 3D list.
An example of what I am trying to do is:
[[[SallyLee],[MallieKim],[KateBrown]],[[JohnDoe],[TreyGreen],[SarahKind]]]
where every third person is "grouped" together.
I am not expecting anyone to code anything for me!
I just hope that someone can lead me into the right direction.
Thank you
First of all, if all you're looking for is strings (not numbers), you can start your for loop with a check that skips any line that begins with a number. You can do this with try:/except:.
Next, you can use the parameters of the range function to generate the indices you're interested in. If you want to group by threes, have range produce the multiples of three (0, 3, 6, 9, ...).
Here's my code:
file = open('text.txt','r')
names = []
for line in file:
    words = line.split() #This will split each line into a list
    if not words: #This will skip empty lines
        continue
    try: #This will try to convert the first element of that list into an integer
        int(words[0]) #If it succeeds, the line starts with a number, so skip it
        continue
    except ValueError:
        names.append(line.strip()) #First let's put all of the names into a list
file.close()
names = [names[i:i+3] for i in range(0,len(names),3)]
print(names)
Output:
[['SallyLee', 'MallieKim', 'KateBrown'], ['JohnDoe', 'TreyGreen', 'SarahKind']]
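The output above is a list of lists of names, while the question asked for each name to sit in its own one-element list. A minimal sketch of that variant follows, assuming the first line of the file is always the numeric header and can simply be dropped; group_names and group_size are hypothetical names chosen for the example.

```python
import io


def group_names(open_file, group_size=3):
    """Return names grouped as [[[name], [name], ...], ...]."""
    lines = [line.strip() for line in open_file]
    # Drop the header line (e.g. "3 4") and any blank lines.
    names = [line for line in lines[1:] if line]
    return [
        [[name] for name in names[i:i + group_size]]
        for i in range(0, len(names), group_size)
    ]


sample = io.StringIO(
    "3 4\nSallyLee\nMallieKim\nKateBrown\nJohnDoe\nTreyGreen\nSarahKind\n"
)
print(group_names(sample))
# [[['SallyLee'], ['MallieKim'], ['KateBrown']], [['JohnDoe'], ['TreyGreen'], ['SarahKind']]]
```

If the number of names is not a multiple of group_size, the last group is simply shorter.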

'Splitting' List into several Arrays

I'm trying to complete a project that will show total annual sales from a specific list contained in a .txt file.
The list is formatted this way:
-lastname, firstname (string)
-45.7 (float)
-456.4 (float)
-345.5 (float)
-lastname2, firstname2 (string)
-3354.7 (float)
-54.6 (float)
-56.2 (float)
-lastname3, firstname3 (string)
-76.6 (float)
-34.2 (float)
-48.2 (float)
And so on. Actually, there are 7 different "employees", each followed by a set of 12 "numbers" (months of the year), but that example should suffice to give an idea of what I'm trying to do.
I need to output this specific information of every "employee"
-Name of employee
-Total Sum (sum of the 12 numbers in the list)
So my logic is taking me to this conclusion, but I don't know where to start:
Create 7 different arrays to store each "employee" data.
With this logic, I need to split the main list into independent arrays so I can work with them.
How can this be achieved? And also, if I don't have a predefined number of employees (but a defined format :: "Name" followed by 12 months of numbers)...how can I achieve this?
I'm sure I can figure once I get an idea how to "split" a list in different sections -- Every 13 lines?
Yes, every thirteenth line starts the information for a new employee.
However, instead of using a separate list per employee, you can use a dictionary of lists, so that you don't have to worry about the number of employees.
You can also use a parameter for the number of lines that belong to each employee.
You could do the following:
infile = open("file.txt", "rt")
employee = dict()
name = infile.readline().strip()
while name:
    employee[name] = list()
    for i in range(12): # 12 monthly values follow each name
        val = float(infile.readline().strip())
        employee[name].append(val)
    name = infile.readline().strip()
Some ways to access dictionary entries:
for name, months in employee.items():
    print(name)
    print(months)
for name in employee.keys():
    print(name)
    print(employee[name])
for months in employee.values():
    print(months)
for name, months in zip(employee.keys(), employee.values()):
    print(name)
    print(months)
The entire process goes as follows:
infile = open("file.txt", "rt")
employee = dict()
name = infile.readline().strip()
while name:
    val = 0.0
    for i in range(12): # sum the 12 monthly values
        val += float(infile.readline().strip())
    employee[name] = val
    print(">>> Employee: " + name + " -- salary: " + str(employee[name]))
    name = infile.readline().strip()
Sorry for beating around the bush a bit (:
Here is one option.
It's not elegant, but it is a brute-force approach that works.
summed = 0
with open("file.txt", "rt") as f:
    print(f.readline().strip()) # We print first line (first man)
    for line in f:
        # then we suppose every line is float.
        try:
            # convert to float
            value = float(line.strip())
            # add to sum
            summed += value
        # If it does not convert, then it is next person
        except ValueError:
            # print sum for previous person
            print(summed)
            # print new name
            print(line.strip())
            # reset sum
            summed = 0
    # at end of file there are no errors, so we print the last result
    print(summed)
Since you need more flexibility, there is another option:
data = {} # dict: list of all values for person by person name
with open("file.txt", "rt") as f:
    data_key = f.readline().strip() # We remember first line (first man)
    data[data_key] = [] # empty list of values
    for line in f:
        # then we suppose every line is float.
        try:
            # convert to float
            value = float(line.strip())
            # add to data
            data[data_key].append(value)
        # If it does not convert, then it is next person
        except ValueError:
            # next person's name
            data_key = line.strip()
            # new list
            data[data_key] = []
Q: let's say that I want to print a '2% bonus' to employees that made more than 7000 in total sales (12 months)
for employee, stats in data.items():
    if sum(stats) > 7000:
        print(employee + " done 7000 in total sales! need 2% bonus")
I would not create 7 different arrays. I would create some sort of data structure to hold all the relevant information for one employee in one data type (this is python, but surely you can create data structures in python as well).
Then, as you process the data for each employee, all you have to do is iterate over one array of employee data elements. That way, it's much easier to keep track of the indices of the data (or maybe even eliminates the need to!).
This is especially helpful if you want to sort the data somehow. That way, you'd only have to sort one array instead of 7.
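The "one data structure per employee" idea can be sketched in Python with a dataclass. This is an illustration under assumptions rather than a definitive design: it needs Python 3.7 or later, the names Employee, monthly_sales and read_employees are hypothetical, and it assumes any line that does not parse as a number is the name of the next employee. The sample uses three months per person instead of twelve to keep it short.

```python
import io
from dataclasses import dataclass, field


@dataclass
class Employee:
    name: str
    monthly_sales: list = field(default_factory=list)

    @property
    def total(self):
        """Sum of all recorded monthly sales."""
        return sum(self.monthly_sales)


def read_employees(open_file):
    """Build a single list of Employee records from name/number lines."""
    employees = []
    for line in open_file:
        text = line.strip()
        if not text:
            continue  # ignore blank lines
        try:
            value = float(text)
        except ValueError:
            employees.append(Employee(name=text))  # a new person starts here
            continue
        if employees:  # ignore numbers that appear before any name
            employees[-1].monthly_sales.append(value)
    return employees


sample = io.StringIO(
    "lastname, firstname\n45.7\n456.4\n345.5\n"
    "lastname2, firstname2\n3354.7\n54.6\n56.2\n"
)
staff = read_employees(sample)
# Only one list needs sorting, here by total sales, highest first.
for person in sorted(staff, key=lambda e: e.total, reverse=True):
    print(person.name, round(person.total, 1))
```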
