How to find the sum of values in a column from a text file if matching certain criteria - python

I'm having some trouble adding together certain values in a column of a text file. My text file looks like:
e320,2/3/5,6661,c120,A,6661
e420,6/5/3,16916,c849,A,24323
e432,6/5/3,6962,c8429,A,4324
e430,6/5/3,4322,c8491,A,4322
e32042,2/3/5,13220,c1120,A,13220
e4202,6/5/3,4232,c8419,E,4232
I would like to find the sum of the last column's values (amount paid), but only for rows where the third column (final total) is equal to the last column. The total of the last column's values should only include rows where the fifth column (status) equals 'E' and finaltotal == amountpaid.
My code so far for this is:
data = open("paintingJobs.txt", "r")
info=data.readlines()
data.close
totalrev=0
for li in info:
    status=li.split(",")[4]
    finaltotal=int(li.split(",")[2])
    amountpaid=int(li.split(",")[5])
    if amountpaid == finaltotal:
        revenue=True
    if status == "A" and revenue == True:
        totalamountpaid = li.split(",")[5]
        total = (sum(totalamountpaid))
        print("The total revenue is")
        print(total)
My desired output would be:
The total revenue is
28435
The total should equal 28435 as 6661+4322+13220+4232=28435 (the sum of the total revenues where status equals 'A' and finaltotal=amountpaid.)
I keep receiving a "TypeError: unsupported operand type(s) for +: 'int' and 'str'". I'm using Python 3.4.3 and am a complete newbie to Python. Any help would be much appreciated.

Try this. Change
total = (sum(totalamountpaid))
to
total = (sum(map(int,totalamountpaid.split(','))))
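This splits the string on commas, uses map to convert each piece to an int, and then sums them. Note that totalamountpaid holds a single field, so this only removes the TypeError; it does not accumulate a total across rows.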

...assuming that the fifth column (status) should be equal to 'E':
data = open("test.txt", "r")
info=data.readlines()
s = sum([int(li.split(',')[5]) for li in info if li.split(",")[4]=="E" and int(li.split(",")[2])==int(li.split(",")[5])])
print("The total revenue is")
print(s)
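With the sample data shown in the question, this returns 4232, because only the last row has status 'E'. Without the status check it would return 28435 (6661+4322+13220+4232).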

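You are fetching strings from your text file. That means you first need to convert the values from strings to an appropriate numeric type before adding them up.
Note that sum() expects an iterable of numbers, so wrapping a single converted value in sum() will not work. Instead, convert the field and add it to a running total, for example totalrev += int(totalamountpaid) (or Decimal(totalamountpaid) or float(totalamountpaid) if the amounts are not whole numbers).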

You just need to make use of the 'totalrev' variable and add 'amountpaid' to it each time the for loop executes, adding only the numbers that meet your criteria. At the end you use it in your print statement. I removed two lines of code that you no longer need after this small change.
data = open("paintingJobs.txt", "r")
info=data.readlines()
data.close()
totalrev=0
for li in info:
    status=(li.split(",")[4])
    finaltotal=int(li.split(",")[2])
    amountpaid=int(li.split(",")[5])
    if amountpaid == finaltotal:
        totalrev += amountpaid
        revenue=True
if status == "E" and revenue == True:
    print("The total revenue is: " + str(totalrev))
This works with the data you provided. I get 28435, which is what you were looking for.

It is because at this line,
total = (sum(totalamountpaid))
the sum function is applied to a string.
So, using your example data, you are effectively asking Python to execute this:
sum("4322")
which is equivalent to
0 + "4" + "3" + "2" + "2"
Of course, you cannot add a string to the numeric value 0, hence the error message.
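A minimal sketch of that failure, using the value "4322" from the sample data. The variable names `field`, `amount` and `running_total` are only illustrative. It shows that sum() on a string fails on the first addition, while converting the whole field with int() and adding it to a running total works.

```python
field = "4322"  # one field as read from the file (still a string)

try:
    sum(field)  # iterates over "4", "3", "2", "2" and tries 0 + "4"
except TypeError as exc:
    error = str(exc)
    print("sum() on a string failed:", error)

amount = int(field)  # convert the whole field once
running_total = 0
running_total += amount  # add the number, not the string
print(running_total)
```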
Actually, there are a few more issues with your code. I think you need to make the changes below to make it work. See the comments (the words after #) for explanations. Not tested.
data = open("paintingJobs.txt", "r")
info=data.readlines()
data.close()  ## Need the '()' to call the function
totalrev=0
for li in info:
    status=li.split(",")[4]
    finaltotal=int(li.split(",")[2])
    amountpaid=int(li.split(",")[5])
    ## Set the flag on every row, so a match on an earlier row does not carry over
    revenue = (amountpaid == finaltotal)
    if status == "A" and revenue == True:
        totalamountpaid = li.split(",")[5]
        ### Assuming you actually want to accumulate the sum in variable `totalrev`
        totalrev += int(totalamountpaid)  ### you need to convert totalamountpaid to a numeric value, and add to the running total `totalrev`
print("The total revenue is")
print(totalrev)
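As a hedged alternative, here is a small sketch that does the same job with the standard csv module. The function name `total_revenue` and its optional `status` parameter are illustrative choices, not something from the question. It sums the amount paid for rows where the final total equals the amount paid, and optionally restricts the sum to one status.

```python
import csv

def total_revenue(lines, status=None):
    """Sum the amount-paid column for rows that are paid in full.

    `lines` is any iterable of CSV text lines (an open file also works).
    `status` is optional: pass "A" or "E" to count only rows with that status.
    """
    total = 0
    for row in csv.reader(lines):
        if len(row) < 6:
            continue  # skip blank or malformed lines
        final_total = int(row[2])
        amount_paid = int(row[5])
        if amount_paid != final_total:
            continue  # not paid in full
        if status is not None and row[4] != status:
            continue  # wrong status
        total += amount_paid
    return total

# The sample rows from the question.
sample = [
    "e320,2/3/5,6661,c120,A,6661",
    "e420,6/5/3,16916,c849,A,24323",
    "e432,6/5/3,6962,c8429,A,4324",
    "e430,6/5/3,4322,c8491,A,4322",
    "e32042,2/3/5,13220,c1120,A,13220",
    "e4202,6/5/3,4232,c8419,E,4232",
]

print("The total revenue is")
print(total_revenue(sample))
```

With the sample rows this gives 28435 when no status is passed, 24203 for status "A" and 4232 for status "E". To read from the real file, you could pass a file opened in a with block (for example open("paintingJobs.txt", newline="")) instead of the list.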

Related

How do I make sure all of my values are computed in my loop?

I am working on a 'keep the change' assignment where I round purchases up to the whole dollar and add the change to a savings account. However, the loop is not going through all of the values in my external text file; it only computes the last value. I tried splitting the file, but that gives me an error. What might be the issue? My external text file looks like this:
10.90
13.59
12.99
(each on different lines)
def main():
    account1 = BankAccount()
    file1 = open("data.txt","r+") # reading the file; + indicates read and write
    s = 0 # to keep track of the new savings
    for n in file1:
        n = float(n) # lets python know that the values are floats and not strings
        z= math.ceil(n) # rounds up to the whole dollar
        amount = float(z-n) # subtract the actual total from the rounded amount to get the change
        print(" Saved $",round(amount,2), "on this purchase",file = file1)
        s = amount + s
    x = (account1.makeSavings(s))
I'm fairly sure this happens because you are printing the amount of money you have saved into the same file you are reading from. In general, you don't want to alter the length of an object while you are iterating over it, because it can cause problems.
account1 = BankAccount()
file1 = open("data.txt","r+") # reading the file; + indicates read and write
s = 0 # to keep track of the new savings
amount_saved = []
for n in file1:
    n = float(n) # lets python know that the values are floats and not strings
    z= math.ceil(n) # rounds up to the whole dollar
    amount = float(z-n) # subtract the actual total from the rounded amount to get the change
    amount_saved.append(round(amount,2))
    s = amount + s
x = (account1.makeSavings(s))
for n in amount_saved:
    print(" Saved $",n, "on this purchase",file = file1) # print each stored amount, not the last `amount`
This will print the amounts you have saved at the end of the file after you are finished iterating through it.
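The same idea can be sketched without the file or the BankAccount class, which are not shown in the question. The helper name `round_up_change` is hypothetical. It computes the change for every purchase first and only reports afterwards, so nothing is modified while it is being iterated over.

```python
import math

def round_up_change(lines):
    """Return the change saved on each purchase when rounding up to a whole dollar."""
    changes = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        price = float(line)
        changes.append(round(math.ceil(price) - price, 2))
    return changes

# The three purchases from the question, as they would be read from the file.
purchases = ["10.90\n", "13.59\n", "12.99\n"]

changes = round_up_change(purchases)
total_saved = round(sum(changes), 2)

for change in changes:
    print(" Saved $", change, "on this purchase")
print("Total saved:", total_saved)
```

With a real file, one option is to read every line into a list with readlines() before the loop and write the report once the loop has finished.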

Python - For every value in text file?

With the code I am writing, I have split a text file on commas, and now I want to turn each value in it into an integer. I tried splitting the text and then converting it to an integer, but that did not work. Is there any way of saying "for all values in a file, do a certain thing"? Also, the number of values isn't fixed; it depends on the user of the programme (it is a 'shopping list' programme).
My current code:
TotalCOSTS=open("TotalCOSTS.txt","r")
Prices=TotalCOSTS.read()
print(Prices)
Prices.strip().split(",")
IntPrices=int(NewPrices)
print(len(IntPrices))
if len(IntPrices)==1:
print("Your total cost is: "+IntPrices +" Pounds")
elif len(IntPrices)>1:
FinalTotal = sum([int(num) for num in IntPrices.split(",")])
print("Your total cost is: "+ FinalTotal +" Pounds")
Prices is the file that the values are contained in, so I've stripped it of whitespace and then split it. That is where I need to continue on from.
Thank you xx
results = [int(i) for i in results]
In Python 3 you can do:
results = list(map(int, results))
NewPrices isn't defined in your code example.
Also, split() returns a list, so its result must be assigned to a variable and each element converted individually.
The most straight-forward way to accomplish what you are trying to do is the following:
total = sum([int(x) for x in TotalCOSTS.read().split(',') if x.isdigit() == True])
But this makes some simplifying assumptions that won't always hold. For example, if something costs $2.99, x.isdigit() is False for '2.99', so that price is silently skipped (and int('2.99') would raise a ValueError anyway). Overall, you want to consider the price in terms of cents (I don't know which currency you are using, but in USD, 100 cents = 1 dollar) so that $2.99 = 299 cents.
So really, you want something like this:
total = sum([round(float(x) * 100) for x in TotalCOSTS.read().split(',') if x.strip()]) / 100
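Another way to avoid rounding surprises is the standard decimal module. This is only a sketch: the function name `total_cost` and the sample string are made up for illustration, and it assumes every non-empty entry is a valid number.

```python
from decimal import Decimal

def total_cost(text):
    """Sum comma-separated prices exactly, ignoring empty entries."""
    return sum(Decimal(item.strip()) for item in text.split(",") if item.strip())

# Hypothetical file contents; the real text would come from TotalCOSTS.read().
prices_text = "2.99, 3, 10.50\n"

total = total_cost(prices_text)
print("Your total cost is: " + str(total) + " Pounds")
```

Decimal keeps the pence exact, so there is no need to multiply by 100 and divide again.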

How do I manipulate data in a list that has been read in from a file using Python 2.x?

I am trying to create a program that will tally the cost of ingredients within a recipe and return a total cost for said recipe. I am teaching myself Python and have set this as a personal, but practical, challenge. However, I have hit a wall. Hard.
My idea was to read a file into a list. Multiply the ingredient within the list by the comma separated numeral. Add it all together, and return a single float for the overall cost.
#Phase 1 - MASTER INGREDIENTS LIST
flour_5lb = 2.5
sugar_4lb = 2.0
butter_lb = 3.0
eggs_doz = 3.0
#PHASE 2 - COST PER UNIT CONVERSION
flour_cup = flour_5lb*(1.0/20)
sugar_cup = sugar_4lb*(1.0/8)
butter_Tbsp = butter_lb*(1.0/32)
eggs_each = eggs_doz*(1.0/12)
#PHASE THREE - RECIPE ASSESSMENT
def main():
    fileObject = open("filname.txt", "r")
    fileLines = fileObject.readlines()
    fileObject.close()
    for line in fileLines:
        print line
    print "\n"
if __name__ == "__main__":
    main()
The for line in fileLines: statement prints the following:
flour_cup, .5
milk_cup, .4
eggs_each, 3
butter_Tbsp, 3
Press any key to continue . . .
If I understand you correctly, you have to parse your file.
For this you need to know the format in which the ingredients are being stored. Since this program is for your personal use you may just choose the most simple.
So let's assume you have your ingredients stored one per line in a simple format like this:
sugar 10g
flour 20g
...
Then you can use Python's built-in split method and iteration to obtain a list of lists: [['sugar', '10g'], ['flour', '20g'], ...].
Getting the amounts into Python floats is a little tricky, since we have to concern ourselves with the units.
Again, choose a fixed set of units to make your life a little easier.
Then use the in operator or the built-in string method that checks whether a given string has a certain suffix. (I will leave it to you to find this method.)
Then the hard part is done. Hope I could help without giving too much away.
Part of your difficulty is knowing how to split your input on the comma -- use split(). Another problem is converting the string to a float -- use float().
Your last problem is mapping input strings to values. You could write a function that maps strings to costs:
if item == "milk_cup":
    return milk_cup
if item == "flour_cup":
    return flour_cup
...
...but the better way (DRY) to do it is to use a dictionary.
In my sample below I've used dict() to make the dictionary as then I don't have to quote every string.
Here's a sample:
#!/usr/bin/python
pricelist = dict(
    flour_cup=1.0,
    milk_cup=0.4,
)
input = ["flour_cup, 0.5", "milk_cup, 0.4"]
total = 0
for line in input:
    item, qty = line.split(",")
    item = item.strip()
    qty = float(qty)
    if item in pricelist:
        cost = qty * pricelist[item]
        print "%s: %.02f\n" % (item, cost)
        total += cost
    else:
        print "I don't know what '%s' is" % item
print "Total: %.02f" % total
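For readers on Python 3, here is a rough equivalent of the dictionary approach. It uses the per-unit costs and the recipe lines from the question; the function name `recipe_cost` and the choice to collect unknown items in a list are assumptions made for this sketch.

```python
# Per-unit costs derived from the question's master list
# (e.g. a 5 lb bag of flour at 2.5 is treated as 20 cups).
price_per_unit = {
    "flour_cup": 2.5 / 20,
    "sugar_cup": 2.0 / 8,
    "butter_Tbsp": 3.0 / 32,
    "eggs_each": 3.0 / 12,
}

def recipe_cost(lines, prices):
    """Return (total cost, list of items that have no known price)."""
    total = 0.0
    unknown = []
    for line in lines:
        if not line.strip():
            continue  # ignore blank lines
        item, qty = line.split(",")
        item = item.strip()
        if item in prices:
            total += float(qty) * prices[item]
        else:
            unknown.append(item)
    return total, unknown

# The recipe lines printed in the question.
recipe = ["flour_cup, .5", "milk_cup, .4", "eggs_each, 3", "butter_Tbsp, 3"]

total, unknown = recipe_cost(recipe, price_per_unit)
print("Total: %.2f" % total)
print("No price for:", unknown)
```

Here milk_cup is reported as unknown because the question's master list has no price for milk.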

Python *.count is returning a letter instead of a number

I am attempting to count the number of times the letter C appears in a list. When I use:
count = data[data.count('C')]
print ("There are", count, "molecules in the file")
when the code is run, it returns There are . molecules in the file
If I type data.count('C') after the program has run, it returns the correct value (43). I can't figure out what I am doing wrong.
Could this line have something to do with it, maybe? ;)
count = data[data.count('C')] # This gives you the value at index data.count('C') of data
The actual count, as you later put it, is:
count = data.count('C')
Try replacing the 1st line with:
count = data.count('C')
Modify the first line:
count = data.count('C')
The problem is that you were printing the n'th element of the list data (where n=count) instead of the count itself.
As a side note, this is a better way to print your result:
print("There are {0} molecules in the file".format(count))
You are using data twice ....
count = data[data.count('C')]
should be
count =data.count('C')
This would print
There are 43 molecules in the file
The good news is that you're getting the correct value from the method.
The bad news is that you're using it incorrectly.
You're using the result as an index into data, which then gives you an element of data instead of the count. Stop doing that.
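A tiny illustration of the difference, using a made-up list (the real `data` from the question is not shown):

```python
data = ["C", "H", "C", "O", "C"]  # hypothetical sample data

count = data.count("C")          # the number of times "C" appears
element = data[data.count("C")]  # the element stored at that index

print("There are", count, "molecules in the file")
print("Indexing with the count gives:", element)
```

In this sample the count is 3, and data[3] is the unrelated element "O", which is the same kind of mix-up that produced the "." in the question.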

'Splitting' List into several Arrays

I'm trying to complete a project that will show total annual sales from a specific list contained in a .txt file.
The list is formatted this way:
-lastname, firstname (string)
-45.7 (float)
-456.4 (float)
-345.5 (float)
-lastname2, firstname2 (string)
-3354.7 (float)
-54.6 (float)
-56.2 (float)
-lastname3, firstname3 (string)
-76.6 (float)
-34.2 (float)
-48.2 (float)
And so on.... Actually, 7 different "employees" followed by 12 set of "numbers" (months of the year)....but that example should suffice to give an idea of what I'm trying to do.
I need to output this specific information of every "employee"
-Name of employee
-Total Sum (sum of the 12 numbers in the list)
So my logic is taking me to this conclusion, but I don't know where to start:
Create 7 different arrays to store each "employee" data.
With this logic, I need to split the main list into independent arrays so I can work with them.
How can this be achieved? And also, if I don't have a predefined number of employees (but a defined format :: "Name" followed by 12 months of numbers)...how can I achieve this?
I'm sure I can figure once I get an idea how to "split" a list in different sections -- Every 13 lines?
Yes, every thirteenth line starts the information for a new employee.
However, instead of using seven different lists, you can use a dictionary of lists, so that you don't have to worry about the number of employees.
You can also use a parameter for the number of lines that belong to each employee.
You could do the following:
infile = open("file.txt", "rt")
employee = dict()
name = infile.readline().strip()
while name:
    employee[name] = list()
    for i in xrange(12):  # read the 12 monthly values
        val = float(infile.readline().strip())
        employee[name].append(val)
    name = infile.readline().strip()
Some ways to access dictionary entries:
for name, months in employee.items():
    print name
    print months
for name in employee.keys():
    print name
    print employee[name]
for months in employee.values():
    print months
for name, months in zip(employee.keys(), employee.values()):  # zip pairs each key with its value
    print name
    print months
The entire process goes as follows:
infile = open("file.txt", "rt")
employee = dict()
name = infile.readline().strip()
while name:
    val = 0.0
    for i in xrange(12):  # add up the 12 monthly values
        val += float(infile.readline().strip())
    employee[name] = val
    print ">>> Employee:", name, " -- total sales:", str(employee[name])
    name = infile.readline().strip()
Sorry for beating around the bush a bit (:
Here is one option.
It is not elegant, but it works as a brute-force approach.
summed = 0
with open("file.txt", "rt") as f:
    print f.readline() # We print the first line (first person)
    for line in f:
        # then we suppose every line is a float.
        try:
            # convert to float
            value = float(line.strip())
            # add to sum
            summed += value
        # If it does not convert, then it is the next person
        except ValueError:
            # print sum for previous person
            print summed
            # print new name
            print line
            # reset sum
            summed = 0
    # at the end of the file there are no errors, so we print the last result
    print summed
Since you need more flexibility, there is another option:
data = {} # dict: list of all values for each person, keyed by name
with open("file.txt", "rt") as f:
    data_key = f.readline() # We remember the first line (first person)
    data[data_key] = [] # empty list of values
    for line in f:
        # then we suppose every line is a float.
        try:
            # convert to float
            value = float(line.strip())
            # add to data
            data[data_key].append(value)
        # If it does not convert, then it is the next person
        except ValueError:
            # next person's name
            data_key = line
            # new list
            data[data_key] = []
Q: Let's say that I want to print a '2% bonus' message for employees that made more than 7000 in total sales (12 months).
for employee, stats in data.iteritems():
    if sum(stats) > 7000:
        print employee + " made more than 7000 in total sales! Needs a 2% bonus"
I would not create 7 different arrays. I would create some sort of data structure to hold all the relevant information for one employee in one data type (this is python, but surely you can create data structures in python as well).
Then, as you process the data for each employee, all you have to do is iterate over one array of employee data elements. That way, it's much easier to keep track of the indices of the data (or maybe even eliminates the need to!).
This is especially helpful if you want to sort the data somehow. That way, you'd only have to sort one array instead of 7.
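One way to sketch that idea in Python 3 is a small dataclass. The class name `Employee`, the helper `parse_employees` and its `values_per_employee` parameter are illustrative assumptions. The sample uses 3 values per employee, as in the shortened example from the question, instead of the real 12.

```python
from dataclasses import dataclass, field

@dataclass
class Employee:
    name: str
    sales: list = field(default_factory=list)

    @property
    def total(self):
        """Sum of all sales figures stored for this employee."""
        return sum(self.sales)

def parse_employees(lines, values_per_employee=12):
    """Group lines into one Employee per name line plus its sales lines."""
    rows = [line.strip() for line in lines if line.strip()]
    step = values_per_employee + 1  # one name line plus the numbers
    employees = []
    for start in range(0, len(rows), step):
        chunk = rows[start:start + step]
        employees.append(Employee(chunk[0], [float(v) for v in chunk[1:]]))
    return employees

# Shortened sample in the question's layout (name, then the numbers).
sample = [
    "lastname, firstname", "45.7", "456.4", "345.5",
    "lastname2, firstname2", "3354.7", "54.6", "56.2",
]

employees = parse_employees(sample, values_per_employee=3)

# A single list of records is easy to sort, here by total sales.
for emp in sorted(employees, key=lambda e: e.total, reverse=True):
    print(emp.name, round(emp.total, 2))
```

Because the number of employees is never hard-coded, the same function should handle 7 employees or any other count, as long as the layout stays fixed.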
