Update: Python average income reading and writing files - python

I am writing code to find the average household income and how many families are below the poverty line.
This is my code so far:
def povertyLevel():
    inFile = open('program10.txt', 'r')
    outFile = open('program10-out.txt', 'w')
    outFile.write(str("%12s %12s %15s\n" % ("Account #", "Income", "Members")))
    lineRead = inFile.readline() # Read first record
    while lineRead != '': # While there are more records
        words = lineRead.split() # Split the records into substrings
        acctNum = int(words[0]) # Convert first substring to integer
        annualIncome = float(words[1]) # Convert second substring to float
        members = int(words[2]) # Convert third substring to integer
        outFile.write(str("%10d %15.2f %10d\n" % (acctNum, annualIncome, members)))
        lineRead = inFile.readline() # Read next record
    # Close the file.
    inFile.close() # Close file
# Call the main function.
povertyLevel()
I am trying to find the average of annualIncome. What I tried was:
avgIncome = (sum(annualIncome)/len(annualIncome))
outFile.write(avgIncome)
I did this inside the while lineRead loop. However, it gave me an error saying:
avgIncome = (sum(annualIncome)/len(annualIncome))
TypeError: 'float' object is not iterable
Currently I am trying to find which households exceed the average income.

The avgIncome calculation expects a sequence (such as a list), but its argument annualIncome is a float (thanks for the correction, Magenta Nova):
annualIncome = float(words[1])
It seems to me you want to build up a list:
allIncomes = []
while lineRead != '':
    ...
    allIncomes.append(annualIncome)
averageInc = sum(allIncomes) / len(allIncomes)
(Note that the average calculation sits one indentation level out, so it runs once after the loop has finished.)
Also, once you get this working, I highly recommend a trip over to https://codereview.stackexchange.com/. You could get a lot of feedback on ways to improve this.
Edit:
In light of your edits, my advice still stands. You need to first compute the average before you can do comparisons. Once you have the average, you will need to loop over the data again to compare each income. Note: I advise saving the data somehow for the second loop, instead of reparsing the file. (You may even wish to separate reading the data from computing the average entirely.) That might best be accomplished with a new object or a namedtuple or a dict.
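A minimal sketch of that two-pass approach, assuming each record is a whitespace-separated line of account number, income, and member count as in the question. The Household namedtuple and the helper names are illustrative rather than anything from the original program, and the sample records are made up:

```python
from collections import namedtuple

# Hypothetical record type: one entry per line of the input file.
Household = namedtuple("Household", ["acct_num", "income", "members"])

def parse_households(lines):
    """Turn whitespace-separated text lines into Household records."""
    households = []
    for line in lines:
        words = line.split()
        if len(words) < 3:
            continue  # skip blank or incomplete lines
        households.append(Household(int(words[0]), float(words[1]), int(words[2])))
    return households

def average_income(households):
    """First pass: average over all incomes (the list must not be empty)."""
    return sum(h.income for h in households) / len(households)

def above_average(households):
    """Second pass: keep the households whose income exceeds the average."""
    avg = average_income(households)
    return [h for h in households if h.income > avg]

# Made-up sample records in place of program10.txt.
sample = ["1001 30000.00 4", "1002 52000.00 2", "1003 18000.00 5"]
households = parse_households(sample)
avg = average_income(households)
richer = above_average(households)
print("Average income: %.2f" % avg)
print("Above average:", [h.acct_num for h in richer])
```

Keeping the parsed records in a list means the file is read once, and the comparison loop works on data already in memory.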

sum() takes an iterable as its argument, and len() takes a sized container such as a list. Read the Python documentation for more on iterables. You are passing a float into them as an argument. What would it mean to get the sum, or the length, of a floating point number? Even thinking outside the world of coding, it's hard to make sense of that.
It seems like you need to review the basics of Python types.

Related

How to change the blank elements in a csv file to zeroes in Python with strip()?

I'm writing a program that reads from an excel file then writes data to a text file. The function in question is supposed to perform a mathematical equation but I'm getting:
ValueError: could not convert string to float: ''
because the 3 "loans" columns have some empty cells instead of zeroes.
I asked my professor for help and they said I should use strip() and if statements but I'm not sure how I'd implement it. I tried making if statements under the original functions they were in but after I made the return statement, the original return statement in the function obviously became uncallable. I tried other things but had no success. I did look up similarly worded questions but no luck there either.
Here are the two functions that get errors:
def debt_service_ratio(salary, loan1, loan2, loan3):
    pay = float(salary)
    mortgage = float(loan1)
    personal_loans = float(loan2)
    student_loans = float(loan3)
    t_owed = mortgage + personal_loans + student_loans
    return 'Total Debt Service Ratio: ' + str(t_owed / pay)
and
debt_service_ratio(row[3], row[5], row[6], row[7])
So I specifically would like to know how you can use the strip function to do this but I also wouldn't mind knowing any other methods.
You need conditions in your function like:
if loan1.strip(): # if stripping all spaces from loan1 gives a non-empty string
    mortgage = float(loan1)
else: # else if it is an empty string
    mortgage = 0
Do this for all variables that may be empty.
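To avoid repeating the same if/else for every column, the check can be pulled into a small helper. This is only a sketch: to_float is a hypothetical name, it treats None and whitespace-only cells as zero, and the function here returns the bare ratio instead of the formatted string from the question:

```python
def to_float(cell):
    """Convert a cell to float, treating None, '' and whitespace-only as 0.0."""
    text = "" if cell is None else str(cell).strip()
    return float(text) if text else 0.0

def debt_service_ratio(salary, loan1, loan2, loan3):
    """Total owed divided by salary; blank loan cells count as zero."""
    pay = to_float(salary)
    total_owed = to_float(loan1) + to_float(loan2) + to_float(loan3)
    return total_owed / pay  # raises ZeroDivisionError if the salary cell is blank

ratio = debt_service_ratio("50000", "1000", "", "   ")
print("Total Debt Service Ratio: " + str(ratio))
```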

Fixing a meeting room function schedule with double and triple bookings to determine space usage

I need to calculate the total amount of time each group uses a meeting space. But the data set has double and triple booking, so I think I need to fix the data first. Disclosure: My coding experience consists solely of working through a few Dataquest courses, and this is my first stackoverflow posting, so I apologize for errors and transgressions.
Each line of the data set contains the group ID and a start and end time. It also includes the booking type, ie. reserved, meeting, etc. Generally, the staff reserve a space for the entire period, which would create a single line, and then add multiple lines for each individual function when the details are known. They should segment the original reserved line so it's only holding space in between functions, but instead they double book the space, so I need to add multiple lines for these interim RES holds, based on the actual holds.
Here's what the data basically looks like:
Existing data:
functions = [['Function', 'Group', 'FunctionType', 'StartTime', 'EndTime'],
             ['01', '01', 'RES', '2019/10/04 07:00', '2019/10/06 17:00'],
             ['02', '01', 'MTG', '2019/10/05 09:00', '2019/10/05 12:00'],
             ['03', '01', 'LUN', '2019/10/05 12:30', '2019/10/05 13:30'],
             ['04', '01', 'MTG', '2019/10/05 14:00', '2019/10/05 17:00'],
             ['05', '01', 'MTG', '2019/10/06 09:00', '2019/10/06 12:00']]
I've tried to iterate using a for loop:
for index, row in enumerate(functions):
    last_row_index = len(functions) - 1
    if index == last_row_index:
        pass
    else:
        current_index = index
        next_index = index + 1
        if row[3] <= functions[next_index][2]:
            next
        elif row[4] == 'RES' or row[6] < functions[next_index][6]:
            copied_current_row = row.copy()
            row[3] = functions[next_index][2]
            copied_current_row[2] = functions[next_index][3]
            functions.append(copied_current_row)
There seems to be a logical problem in here, because that last append line seems to put the program into some kind of loop and I have to manually interrupt it. So I'm sure it's obvious to someone experienced, but I'm pretty new.
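The runaway loop is consistent with how Python iterates over lists: a for loop over a list (including through enumerate) also visits items that are appended during the loop, so a body that keeps appending never reaches the end. A small illustration with made-up values, where the length guard exists only so the demonstration stops:

```python
# Appending inside the loop: the appended items get visited as well.
items = [1, 2, 3]
visited = []
for index, value in enumerate(items):
    visited.append(value)
    if len(items) < 6:  # guard added only so this demo terminates
        items.append(value * 10)

# Safer pattern: collect new rows separately and add them after the loop.
source = [1, 2, 3]
new_rows = []
for value in source:
    new_rows.append(value * 10)
source.extend(new_rows)

print(visited)  # [1, 2, 3, 10, 20, 30]
print(source)   # [1, 2, 3, 10, 20, 30]
```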
The reason I've done the comparison to see if a function is RES is that reserved should be subordinate to actual functions. But sometimes there are overlaps between actual functions, so I'll need to create another comparison to decide which one takes precedence, but this is where I'm starting.
How I (think) I want it to end up:
[['Function', 'Group', 'FunctionType', 'StartTime', 'EndTime'],
 ['01', '01', 'RES', '2019/10/04 07:00', '2019/10/05 09:00'],
 ['02', '01', 'MTG', '2019/10/05 09:00', '2019/10/05 12:00'],
 ['01', '01', 'RES', '2019/10/05 12:00', '2019/10/05 12:30'],
 ['03', '01', 'LUN', '2019/10/05 12:30', '2019/10/05 13:30'],
 ['01', '01', 'RES', '2019/10/05 13:30', '2019/10/05 14:00'],
 ['04', '01', 'MTG', '2019/10/05 14:00', '2019/10/05 17:00'],
 ['01', '01', 'RES', '2019/10/05 17:00', '2019/10/06 09:00'],
 ['05', '01', 'MTG', '2019/10/06 09:00', '2019/10/06 12:00'],
 ['01', '01', 'RES', '2019/10/06 12:00', '2019/10/06 17:00']]
This way, I could do a simple calculation of elapsed time for each function line and add it up to see how much time they had the space booked for.
What I'm looking for here is just some direction I should pursue, and I'm definitely not expecting anyone to do the work for me. For example, am I on the right path here, or would it be better to use pandas and vectorized functions? If I can get the basic direction right, I think I can muddle through the specifics.
Thank-you very much,
AF
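One possible direction, sketched in plain Python rather than pandas: build the interim RES rows into a new list instead of editing the list being looped over. The sketch assumes timestamps are strings in the YYYY/MM/DD HH:MM form, rows are laid out as [Function, Group, FunctionType, StartTime, EndTime], and the actual functions of a group do not overlap each other. split_reservation and hours are hypothetical helper names:

```python
from datetime import datetime

FMT = "%Y/%m/%d %H:%M"  # assumed timestamp format

def parse(ts):
    return datetime.strptime(ts, FMT)

def split_reservation(res, bookings):
    """Return RES rows that cover only the gaps between the given bookings."""
    func_id, group, kind = res[0], res[1], res[2]
    cursor, end = parse(res[3]), parse(res[4])
    gaps = []
    for row in sorted(bookings, key=lambda r: parse(r[3])):
        start, stop = parse(row[3]), parse(row[4])
        if start > cursor:
            # Free time before this booking stays with the reservation.
            gap_end = min(start, end)
            gaps.append([func_id, group, kind,
                         cursor.strftime(FMT), gap_end.strftime(FMT)])
        cursor = max(cursor, stop)
        if cursor >= end:
            break
    if cursor < end:
        gaps.append([func_id, group, kind,
                     cursor.strftime(FMT), end.strftime(FMT)])
    return gaps

def hours(row):
    """Elapsed hours between a row's start and end time."""
    return (parse(row[4]) - parse(row[3])).total_seconds() / 3600

reservation = ["01", "01", "RES", "2019/10/04 07:00", "2019/10/06 17:00"]
bookings = [
    ["02", "01", "MTG", "2019/10/05 09:00", "2019/10/05 12:00"],
    ["03", "01", "LUN", "2019/10/05 12:30", "2019/10/05 13:30"],
    ["04", "01", "MTG", "2019/10/05 14:00", "2019/10/05 17:00"],
    ["05", "01", "MTG", "2019/10/06 09:00", "2019/10/06 12:00"],
]
gaps = split_reservation(reservation, bookings)
total_hours = sum(hours(row) for row in gaps + bookings)
for row in gaps:
    print(row)
print("Total hours booked:", total_hours)
```

If real functions can overlap each other, the gap rows are still correct, but the total would count the overlapping time twice, so a precedence rule would need to be applied first.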

Python - program for searching for relevant cells in excel does not work correctly

I've written a code to search for relevant cells in an excel file. However, it does not work as well as I had hoped.
In pseudocode, this is what it should do:
Ask for input excel file
Ask for input textfile containing keywords to search for
Convert input textfile to list containing keywords
For each keyword in list, scan the excelfile
If the keyword is found within a cell, write it into a new excelfile
Repeat with next word
The code works, but some keywords are not found while they are present within the input excelfile. I think it might have something to do with the way I iterate over the list, since when I provide a single keyword to search for, it works correctly. This is my whole code: https://pastebin.com/euZzN3T3
This is the part I suspect is not working correctly. Splitting the textfile into a list works fine (I think).
#IF TEXTFILE
elif btext == True:
    #Split each line of textfile into a list
    file = open(txtfile, 'r')
    #Keywords in list
    for line in file:
        keywordlist = file.read().splitlines()
    nkeywords = len(keywordlist)
    print(keywordlist)
    print(nkeywords)
    #Iterate over each string in list, look for match in .xlsx file
    for i in range(1, nkeywords):
        nfound = 0
        ws_matches.cell(row = 1, column = i).value = str.lower(keywordlist[i-1])
        for j in range(1, worksheet.max_row + 1):
            cursor = worksheet.cell(row = j, column = c)
            cellcontent = str.lower(cursor.value)
            if match(keywordlist[i-1], cellcontent) == True:
                ws_matches.cell(row = 2 + nfound, column = i).value = cellcontent
                nfound = nfound + 1
and my match() function:
def match(keyword, content):
    """Check if the keyword is present within the cell content, return True if found, else False"""
    if content.find(keyword) == -1:
        return False
    else:
        return True
I'm new to Python so my apologies if the way I code looks like a warzone. Can someone help me see what I'm doing wrong (or could be doing better?)? Thank you for taking the time!
Splitting the textfile into a list works fine (I think).
This is something you should actually test (hint: it mostly does, but the for line in file loop consumes the first line before file.read() runs, so the first keyword is dropped). The best way to make easily testable code is to isolate functional units into separate functions, i.e. you could make a function that takes the name of a text file and returns a list of keywords. Then you can easily check if that bit of code works on its own. A more pythonic way to read lines from a file (which is what you do, assuming one word per line) is as follows:
with open(filename) as f:
    keywords = f.read().splitlines()  # unlike readlines(), this drops the trailing newlines
The rest of your code may actually work better than you expect. I'm not able to test it right now (and don't have your spreadsheet to try it on anyway), but note that nfound is reset to zero for every keyword, so after the loops finish it only holds the count for the last keyword. That reset is needed so each keyword's matches start at row 2 of its column; if you also want a total across all keywords, keep a separate counter outside the loop.
In Python, the way to iterate over lists - or just about anything - is not to increment an integer and then use that integer to index the value in the list. Rather loop over the list (or other iterable) itself:
for keyword in keywordlist:
...
As a hint, you shouldn't need nkeywords at all.
I hope this gets you on the right track. When asking questions in future, it'd be a great help to provide more information about what goes wrong, and preferably enough to be able to reproduce the error.
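A short sketch of both points, using an in-memory text file and a plain list of strings in place of the spreadsheet (both made up for illustration). It shows the first line being lost when iteration and read() are mixed, and a loop over the keywords themselves. Note also that range(1, nkeywords) stops one short, so the last keyword is never searched; enumerate(keywords, start=1) would give the column number without that off-by-one. read_keywords and find_matches are hypothetical helpers:

```python
import io

def read_keywords(handle):
    """Return one lowercase keyword per non-empty line of an open text file."""
    return [line.strip().lower() for line in handle if line.strip()]

def find_matches(keywords, cells):
    """Map each keyword to the cell texts that contain it (case-insensitive)."""
    matches = {}
    for keyword in keywords:
        matches[keyword] = [cell for cell in cells if keyword in cell.lower()]
    return matches

# Mixing iteration and read(): the loop takes the first line, read() gets the rest.
buggy = io.StringIO("Apple\nbanana\ncherry\n")
for line in buggy:
    rest = buggy.read().splitlines()

sample_file = io.StringIO("Apple\nbanana\ncherry\n")
keywords = read_keywords(sample_file)
cells = ["Apple pie", "Banana bread", "cherry tart", "plain toast"]
result = find_matches(keywords, cells)

print(rest)      # ['banana', 'cherry']
print(keywords)  # ['apple', 'banana', 'cherry']
print(result)
```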

How do I manipulate data in a list that has been read in from a file using Python 2.x?

I am trying to create a program that will tally the cost of ingredients within a recipe and return a total cost for said recipe. I am teaching myself Python and have set this as a personal, but practical, challenge. However, I have hit a wall. Hard.
My idea was to read a file into a list. Multiply the ingredient within the list by the comma separated numeral. Add it all together, and return a single float for the overall cost.
#Phase 1 - MASTER INGREDIENTS LIST
flour_5lb = 2.5
sugar_4lb = 2.0
butter_lb = 3.0
eggs_doz = 3.0
#PHASE 2 - COST PER UNIT CONVERSION
flour_cup = flour_5lb*(1.0/20)
sugar_cup = sugar_4lb*(1.0/8)
butter_Tbsp = butter_lb*(1.0/32)
eggs_each = eggs_doz*(1.0/12)
#PHASE THREE - RECIPE ASSESSMENT
def main():
    fileObject = open("filname.txt", "r")
    fileLines = fileObject.readlines()
    fileObject.close()
    for line in fileLines:
        print line
    print "\n"
if __name__ == "__main__":
    main()
The for line in fileLines: statement prints the following:
flour_cup, .5
milk_cup, .4
eggs_each, 3
butter_Tbsp, 3
Press any key to continue . . .
If I understand you correctly, you have to parse your file.
For this you need to know the format in which the ingredients are being stored. Since this program is for your personal use you may just choose the most simple.
So let's assume you store your ingredients one per line, separated by whitespace:
sugar 10g
flour 20g
...
Then you can use Python's built-in string method split and iteration to obtain a list of lists [['sugar', '10g'], ['flour', '20g'], ...].
Getting the amounts into Python floats is a little tricky, since we have to concern ourselves with the units.
Again - choose a fixed set of units to make your life a little easier.
Then use the in operator or the built-in string method which checks if a given string has a certain suffix. (I will leave it to you to find this method.)
Then the hard part is done. Hope I could help without giving too much away.
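For reference, one way those steps could fit together, written for Python 3. The unit set and both helper names are assumptions made for this sketch, not part of the answer above:

```python
UNITS = ("kg", "g", "ml", "l")  # assumed fixed set of units

def parse_amount(text):
    """Split a string such as '10g' into (10.0, 'g')."""
    # Try longer units first so '2kg' is not read as '2k' plus 'g'.
    for unit in sorted(UNITS, key=len, reverse=True):
        if text.endswith(unit):
            return float(text[:-len(unit)]), unit
    raise ValueError("unknown unit in %r" % text)

def parse_ingredients(lines):
    """Turn 'name amount' lines into (name, quantity, unit) tuples."""
    parsed = []
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        name, amount = parts
        parsed.append((name,) + parse_amount(amount))
    return parsed

ingredients = parse_ingredients(["sugar 10g", "flour 20g"])
print(ingredients)  # [('sugar', 10.0, 'g'), ('flour', 20.0, 'g')]
```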
Part of your difficulty is knowing how to split your input on the comma -- use split(). Another problem is converting the string to a float -- use float().
Your last problem is mapping input strings to values. You could write a function that maps strings to costs:
if item == "milk_cup":
    return milk_cup
if item == "flour_cup":
    return flour_cup
...
...but the better way (DRY) to do it is to use a dictionary.
In my sample below I've used dict() to make the dictionary as then I don't have to quote every string.
Here's a sample:
#!/usr/bin/python
pricelist = dict(
    flour_cup=1.0,
    milk_cup=0.4,
)
input = ["flour_cup, 0.5", "milk_cup, 0.4"]
total = 0
for line in input:
    item, qty = line.split(",")
    item = item.strip()
    qty = float(qty)
    if item in pricelist:
        cost = qty * pricelist[item]
        print "%s: %.02f\n" % (item, cost)
        total += cost
    else:
        print "I don't know what '%s' is" % item
print "Total: %.02f" % total
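The same idea as a function, sketched for Python 3 (where print is a function) and using the per-unit prices and the recipe lines from the question. recipe_cost and the unknown-items list are illustrative additions; milk_cup is deliberately left out of the price table because the question never gives it a price:

```python
# Per-unit prices derived from the question's master ingredient list.
PRICES = {
    "flour_cup": 2.5 / 20,
    "sugar_cup": 2.0 / 8,
    "butter_Tbsp": 3.0 / 32,
    "eggs_each": 3.0 / 12,
}

def recipe_cost(lines, prices):
    """Return (total cost, list of items that have no known price)."""
    total = 0.0
    unknown = []
    for line in lines:
        if not line.strip():
            continue  # ignore blank lines
        item, qty = line.split(",")
        item = item.strip()
        if item in prices:
            total += float(qty) * prices[item]
        else:
            unknown.append(item)
    return total, unknown

recipe = ["flour_cup, .5", "milk_cup, .4", "eggs_each, 3", "butter_Tbsp, 3"]
total, unknown = recipe_cost(recipe, PRICES)
print("Total: %.2f" % total)
print("No price for:", unknown)
```

Returning the unpriced items instead of printing inside the loop makes it easy to decide later whether a missing price should be an error.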

Python - 'str' object has no attribute 'append'

I've searched this error on here, but haven't seen anything that yet matches my situation (disclaimer, I'm still getting used to Python).
import os
os.chdir("C:\Projects\Rio_Grande\SFR_Checking") # set working directory
stressPeriod = 1
segCounter = 1
inFlow = 0
outFlow = 0
with open(r"C:\Projects\streamflow.dat") as inputFile:
    inputList = list(inputFile)
while stressPeriod <= 1:
    segCounter = 1
    lineCounter = 1
    outputFile = open("stats.txt", 'w') # Create the output file
    for lineItem in inputList:
        if (((stressPeriod - 1) * 11328) + 8) < lineCounter <= (stressPeriod * 11328):
            lineItem = lineItem.split()
            if int(lineItem[3]) == int(segCounter) and int(lineItem[4]) == int(1):
                inFlow = lineItem[5]
                outFlow = lineItem[7]
                lineItemMem = lineItem
            elif int(lineItem[3]) == int(segCounter) and int(lineItem[4]) <> int(1):
                outFlow = lineItem[7]
            else:
                gainLoss = str(float(outFlow) - float(inFlow))
                lineItemMem.append(gainLoss)
                lineItemMem = ','.join(lineItemMem)
                outputFile.write(lineItemMem + "\n") # write # lines to file
                segCounter += 1
                inFlow = lineItem[5]
                outFlow = lineItem[7]
        lineCounter += 1
    outputFile.close()
So basically this program is supposed to read a .dat file and parse out bits of information from it. I split each line of the file into a list to do some math on it (math operations are between varying lines in the file, which adds complexity to the code). I then append a new number to the end of the list for a given line, and that's where things inexplicably break down. I get the following error:
Traceback (most recent call last):
File "C:/Users/Chuck/Desktop/Python/SFR/SFRParser2.py", line 49, in <module>
lineItemMem.append(gainLoss)
AttributeError: 'str' object has no attribute 'append'
When I give it a print command to test that lineItemMem is actually a list and not a string, it prints a list for me. If I put in code for lineItemMem.split(",") to break the string, I get an error saying that the list object has no attribute split. So basically, when I try to do list operations, the error says it's a string, and when I try to do string operations, the error says it's a list. I've tried a fair bit of mucking around, but frankly can't tell what the problem is here. Any insight is appreciated, thanks.
I think the issue has to do with these lines:
lineItemMem.append(gainLoss)
lineItemMem = ','.join(lineItemMem)
Initially lineItemMem is a list, and you can append an item to the end of it. However, the join call you're doing turns the list into a string. That means the next time this part of the code runs, the append call will fail.
I'm not certain exactly what the best solution is. Perhaps you should use a different variable for the string version? Or maybe after you join the list items together into a single string and write that result out, you should reinitialize the lineItemMem variable to a new empty list? You'll have to decide what works best for your actual goals.
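A small reproduction of the first option, with a made-up record standing in for one line of the .dat file. The variable names are illustrative:

```python
fields = "1 1 1 4 1 10.0 0 8.5".split()  # made-up record

# Reusing one name for both the list and the joined string:
line_item = list(fields)
line_item.append("-1.5")
line_item = ",".join(line_item)  # line_item is now a str
try:
    line_item.append("0.0")      # fails on the second use
    error = None
except AttributeError as exc:
    error = str(exc)

# Keeping separate names for the list and its text form:
row = list(fields)
row.append("-1.5")
row_text = ",".join(row)         # row itself is still a list
row.append("0.0")                # so this keeps working

print(error)
print(row_text)
```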
There are two places where lineItemMem is set. The first is this:
lineItem = lineItem.split()
# ...
lineItemMem = lineItem
where it is set to the result of a split operation, i.e. a list.
The second place is this:
lineItemMem = ','.join(lineItemMem)
Here, it is set to the result of a join operation, i.e. a string.
So the reason the error sometimes says it is a string and sometimes a list is that both are actually the case, depending on which branch of the if statement ran last.
The code, as presented, is imho near undebuggable. Instead of tinkering, it would be a better approach to think about the different goals that should be achieved (reading a file, parsing the content, formatting the data, writing it to another file) and tackle them individually.
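A rough sketch of what tackling those goals individually could look like, written for Python 3. It assumes, based only on the indices used in the question, that column 3 is the segment, column 4 the reach, column 5 the inflow and column 7 the outflow, and that gain/loss is the last reach's outflow minus the first reach's inflow. The sample lines and all function names are made up:

```python
from itertools import groupby

def parse_line(line):
    """Parsing: pull the assumed columns out of one whitespace-separated line."""
    fields = line.split()
    return {
        "fields": fields,
        "segment": int(fields[3]),
        "inflow": float(fields[5]),
        "outflow": float(fields[7]),
    }

def segment_gain_loss(records):
    """Computing: yield (first row's fields, gain/loss) per run of one segment."""
    for _, group in groupby(records, key=lambda r: r["segment"]):
        rows = list(group)
        yield rows[0]["fields"], rows[-1]["outflow"] - rows[0]["inflow"]

def format_row(fields, gain_loss):
    """Formatting: build the output text without modifying the input list."""
    return ",".join(fields + [str(gain_loss)])

# Made-up lines standing in for the data section of streamflow.dat.
sample_lines = [
    "0 0 0 1 1 10.0 0 8.0",
    "0 0 0 1 2 8.0 0 7.5",
    "0 0 0 2 1 5.0 0 6.0",
]
records = [parse_line(line) for line in sample_lines]
output_rows = [format_row(f, g) for f, g in segment_gain_loss(records)]
for row in output_rows:
    print(row)
```

Writing output_rows to a file is then a separate, final step, and each function can be checked on its own.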
