'Splitting' List into several Arrays - python

I'm trying to complete a project that will show total annual sales from a specific list contained in a .txt file.
The list is formatted this way:
-lastname, firstname (string)
-45.7 (float)
-456.4 (float)
-345.5 (float)
-lastname2, firstname2 (string)
-3354.7 (float)
-54.6 (float)
-56.2 (float)
-lastname3, firstname3 (string)
-76.6 (float)
-34.2 (float)
-48.2 (float)
And so on.... Actually, there are 7 different "employees", each followed by a set of 12 "numbers" (months of the year), but that example should suffice to give an idea of what I'm trying to do.
I need to output this specific information of every "employee"
-Name of employee
-Total Sum (sum of the 12 numbers in the list)
So my logic is taking me to this conclusion, but I don't know where to start:
Create 7 different arrays to store each "employee" data.
With this logic, I need to split the main list into independent arrays so I can work with them.
How can this be achieved? And also, if I don't have a predefined number of employees (but a defined format :: "Name" followed by 12 months of numbers)...how can I achieve this?
I'm sure I can figure it out once I get an idea of how to "split" a list into different sections -- every 13 lines?

Yes, every thirteenth line starts the information for a new employee.
However, instead of using seven different lists, you can use a dictionary of lists, so that you don't have to worry about the number of employees.
You could also make the number of lines that belong to each employee a parameter.
You could do the following:
infile = open("file.txt", "rt")
employee = dict()
name = infile.readline().strip()
while name:
    employee[name] = list()
    # read the 12 monthly values that follow each name
    for i in xrange(12):
        val = float(infile.readline().strip())
        employee[name].append(val)
    name = infile.readline().strip()
infile.close()
Some ways to access dictionary entries:
for name, months in employee.items():
    print name
    print months
for name in employee.keys():
    print name
    print employee[name]
for months in employee.values():
    print months
# keys() and values() come back in matching order, so they can be zipped
for name, months in zip(employee.keys(), employee.values()):
    print name
    print months
The entire process goes as follows:
infile = open("file.txt", "rt")
employee = dict()
name = infile.readline().strip()
while name:
    val = 0.0
    # add up the 12 monthly values that follow each name
    for i in xrange(12):
        val += float(infile.readline().strip())
    employee[name] = val
    print ">>> Employee:", name, " -- total sales:", str(employee[name])
    name = infile.readline().strip()
infile.close()
Sorry for beating around the bush somewhat (:
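The "every 13 lines" idea from the question can also be sketched with plain list slicing. This is only a sketch under assumptions: the helper name split_records and the shortened sample data (3 numbers per employee instead of 12) are made up for illustration, and the code is written to run under Python 3, unlike the Python 2 snippets above.

```python
def split_records(lines, record_size=13):
    """Yield (name, values) for each block of record_size lines.

    Assumes every block is one name line followed by numeric lines.
    """
    for start in range(0, len(lines), record_size):
        block = lines[start:start + record_size]
        name = block[0].strip()
        values = [float(item) for item in block[1:]]
        yield name, values


# Hypothetical sample: 2 employees with 3 numbers each, so record_size=4
sample = ["Doe, John", "45.7", "456.4", "345.5",
          "Roe, Jane", "3354.7", "54.6", "56.2"]

totals = {name: sum(values)
          for name, values in split_records(sample, record_size=4)}
```

For the real file, the lines could be read with readlines() and record_size left at 13 (one name plus 12 months).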

Here is one option.
It's not elegant, but it works as a brute-force approach.
summed = 0
with open("file.txt", "rt") as f:
    print f.readline() # We print first line (first man)
    for line in f:
        # then we suppose every line is float.
        try:
            # convert to float
            value = float(line.strip())
            # add to sum
            summed += value
        # If it does not convert, then it is next person
        except ValueError:
            # print sum for previous person
            print summed
            # print new name
            print line
            # reset sum
            summed = 0
    # at the end of the file there are no errors, so we print the last result
    print summed
Since you need more flexibility, there is another option:
data = {} # dict: list of all values for person by person name
with open("file.txt", "rt") as f:
    data_key = f.readline().strip() # We remember first line (first man)
    data[data_key] = [] # empty list of values
    for line in f:
        # then we suppose every line is float.
        try:
            # convert to float
            value = float(line.strip())
            # add to data
            data[data_key].append(value)
        # If it does not convert, then it is next person
        except ValueError:
            # next person's name (without the trailing newline)
            data_key = line.strip()
            # new list
            data[data_key] = []
Q: Let's say that I want to print a '2% bonus' for employees that made more than 7000 in total sales (12 months).
for employee, stats in data.iteritems():
    if sum(stats) > 7000:
        print employee + " made more than 7000 in total sales! Needs a 2% bonus"
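Putting the try/except parsing and the bonus check together, a self-contained sketch might look like the following. The function names read_sales and bonuses, the in-memory sample file, and the reading of "2% bonus" as 2% of total sales are all assumptions made for illustration; the code targets Python 3.

```python
import io


def read_sales(f):
    """Group numeric lines under the most recent non-numeric (name) line."""
    data = {}
    current = None
    for line in f:
        text = line.strip()
        if not text:
            continue  # skip blank lines
        try:
            value = float(text)
        except ValueError:
            # not a number, so this line starts a new employee
            current = text
            data[current] = []
        else:
            if current is not None:
                data[current].append(value)
    return data


def bonuses(data, threshold=7000, rate=0.02):
    """Return {name: bonus} for employees whose total exceeds threshold."""
    return {name: sum(values) * rate
            for name, values in data.items()
            if sum(values) > threshold}


# Hypothetical sample file held in memory
sample_file = io.StringIO(
    u"Doe, John\n4000.0\n3500.0\nRoe, Jane\n100.0\n200.0\n")
sales = read_sales(sample_file)
bonus = bonuses(sales)
```

Because the names are detected by a failed float conversion, no fixed number of months or employees has to be known in advance.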

I would not create 7 different arrays. I would create some sort of data structure to hold all the relevant information for one employee in one data type (Python lets you define such data structures, for example with a class).
Then, as you process the data for each employee, all you have to do is iterate over one array of employee data elements. That way, it's much easier to keep track of the indices of the data (or maybe even eliminates the need to!).
This is especially helpful if you want to sort the data somehow. That way, you'd only have to sort one array instead of 7.
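As a rough illustration of this suggestion, one record type per employee could be sketched with collections.namedtuple. The type name Employee, its field names, and the sample figures are assumptions chosen for the example (Python 3).

```python
from collections import namedtuple

# One record per employee instead of one array per employee
Employee = namedtuple("Employee", ["name", "monthly_sales"])

# Hypothetical data, shortened to 3 months per employee
employees = [
    Employee("Doe, John", [45.7, 456.4, 345.5]),
    Employee("Roe, Jane", [3354.7, 54.6, 56.2]),
]

# Sorting one list of records keeps each name attached to its numbers
by_total = sorted(employees,
                  key=lambda e: sum(e.monthly_sales),
                  reverse=True)
names_by_total = [e.name for e in by_total]
```

Only one list has to be sorted, and no index bookkeeping between parallel arrays is needed.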

Related

how to export selected dictionary data into a file in python?

Right now, when I run this function, I get the following error in export_incomes:
if income_types in expenses:
TypeError: unhashable type: 'list'
I am not sure what I did wrong here.
I created an incomes dictionary with income_types as keys and income_value as values.
The function should iterate over the given incomes dictionary, filter it by the given income types (a list of strings), and export the matching entries to the given file.
def export_incomes(incomes, income_types, file):
    final_list = []
    for u, p in expenses.items():
        if expense_types in expenses:
            final_list.append(':'.join([u,p]) + '\n')
    fout = open(file, 'w')
    fout.writelines(final_list)
    fout.close()
If the user inputs stock, estate, work and investment, this is what should end up in the txt file, with each item and value on a separate line:
stock: 10000
estate: 2000
work: 80000
investment: 30000
First, you start your question with expenses but end with incomes, and the function parameters also use incomes, so I'll go with incomes.
Second, the error message gives the answer. expense_types (income_types) is a list and expenses (incomes) is a dictionary. You're testing whether a list, which is not hashable, is a key of a dictionary.
So to make your code work:
def export_incomes(incomes, income_types, file):
    items_to_export = []
    for u, p in incomes.items():
        if u in income_types:
            # str(p) so that numeric values can be joined as well
            items_to_export.append(': '.join([u, str(p)]) + '\n') # whitespace after ':' for spacing
    with open(file, 'w') as f:
        f.writelines(items_to_export)
If I made any wrong assumptions or misunderstood your intention, please let me know.
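The error discussed in this answer can be reproduced in isolation. The sketch below uses made-up sample values and runs under Python 3; it shows that testing a list for membership in a dictionary raises TypeError, while testing each key against the list works.

```python
incomes = {"stock": 10000, "estate": 2000}
income_types = ["stock", "work"]

try:
    # A dict lookup hashes its operand, and lists cannot be hashed
    income_types in incomes
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# Checking each key against the list is what the filter actually needs
matches = [kind for kind in incomes if kind in income_types]
```

The direction of the test is the whole fix: "key in list" instead of "list in dict".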

How to not hardcode function in this example

The following links contain the 2 CSV files that the function should be able to process: grades_1e_2a and grades_2e_4a.
However, my function can only process the 2nd linked file, as it is hardcoded to range(4,8).
output: [91.5, 73.5, 81.5, 91.5]
In every input file the grades start at the 4th element, but they may not necessarily end at the 8th element.
def class_avg(open_file):
    '''(file) -> list of float
    Return a list of assignment averages for the entire class given the open
    class file. The returned list should contain assignment averages in the
    order listed in the given file. For example, if there are 3 assignments
    per student, the returned list should contain 3 floats representing the 3 averages.
    '''
    marks=[[],[],[],[]]
    avgs = []
    for line in open_file:
        grades_list = line.strip().split(',')
        for idx,i in enumerate(range(4,8)):
            marks[idx].append(float(grades_list[i]))
    for mark in marks:
        avgs.append(float(sum(mark)/(len(mark))))
    return avgs
How do I fix this so that my code will be able to read both files, or any file?
I have already opened the file and iterated past the first line with file.readline() in a previous function on the same file.
Thanks for everyone's help in advance.
Updated progress: https://gyazo.com/064dd0d695e3a3e1b4259a25d1b0b1a0
As both sets of your data start at the same place, the following works:
for idx,i in enumerate(range(4,len(grades_list))):
This should fulfill all the requirements that I'm aware of up to this point:
def class_avg(open_file):
    '''(file) -> list of float
    Return a list of assignment averages for the entire class given the open
    class file. The returned list should contain assignment averages in the
    order listed in the given file. For example, if there are 3 assignments
    per student, the returned list should contain 3 floats representing the 3 averages.
    '''
    marks = None
    avgs = []
    for line in open_file:
        grades_list = line.strip().split(',')
        if marks is None:
            # one list per assignment column, sized from the first row
            marks = []
            for i in range(len(grades_list) - 4):
                marks.append([])
        for idx, i in enumerate(range(4, len(grades_list))):
            # float, not int, so the average is not truncated in Python 2
            marks[idx].append(float(grades_list[i]))
    for mark in marks:
        avgs.append(sum(mark) / len(mark))
    return avgs
Try it like this:
def class_avg(open_file, start = 4, end = 8):
    ...
    ...
    for idx,i in enumerate(range(start, end)):
Try using this:
for idx,i in enumerate(range(4,len(grades_list))):
    marks[idx].append(float(grades_list[i]))
This assumes you know how many assignments there are and have initialized the marks list accordingly.
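Another way to avoid hardcoding the column range is to collect the grade columns row by row and then transpose them with zip. This is a sketch, not the approach from the answers above: the parameter name first_grade_col and the sample rows are invented, since the real column layout of the linked files is not shown in the thread. It is written for Python 3.

```python
import io


def class_avg(open_file, first_grade_col=4):
    """Return the average of every grade column, in file order."""
    rows = []
    for line in open_file:
        fields = line.strip().split(",")
        if len(fields) <= first_grade_col:
            continue  # skip blank or short lines
        rows.append([float(x) for x in fields[first_grade_col:]])
    # zip(*rows) regroups the values column by column
    return [sum(col) / len(col) for col in zip(*rows)]


# Hypothetical file with 2 students and 2 assignments
sample = io.StringIO(
    u"Doe,John,123,LEC01,90,70\nRoe,Jane,456,LEC01,93,77\n")
averages = class_avg(sample)
```

The number of assignments is never stated in the code; it follows from however many columns each row has after the first four.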

How do I manipulate data in a list that has been read in from a file using Python 2.x?

I am trying to create a program that will tally the cost of ingredients within a recipe and return a total cost for said recipe. I am teaching myself Python and have set this as a personal, but practical, challenge. However, I have hit a wall. Hard.
My idea was to read a file into a list. Multiply the ingredient within the list by the comma separated numeral. Add it all together, and return a single float for the overall cost.
#Phase 1 - MASTER INGREDIENTS LIST
flour_5lb = 2.5
sugar_4lb = 2.0
butter_lb = 3.0
eggs_doz = 3.0
#PHASE 2 - COST PER UNIT CONVERSION
flour_cup = flour_5lb*(1.0/20)
sugar_cup = sugar_4lb*(1.0/8)
butter_Tbsp = butter_lb*(1.0/32)
eggs_each = eggs_doz*(1.0/12)
#PHASE THREE - RECIPE ASSESSMENT
def main():
    fileObject = open("filname.txt", "r")
    fileLines = fileObject.readlines()
    fileObject.close()
    for line in fileLines:
        print line
    print "\n"
if __name__ == "__main__":
    main()
The for line in fileLines: statement prints the following:
flour_cup, .5
milk_cup, .4
eggs_each, 3
butter_Tbsp, 3
Press any key to continue . . .
If I understand you correctly, you have to parse your file.
For this you need to know the format in which the ingredients are stored. Since this program is for your personal use, you can simply choose the simplest one.
So let's assume you store one ingredient per line, with the name and the amount separated by a space:
sugar 10g
flour 20g
...
Then you can use Python's built-in string method split while iterating over the lines to obtain a list of lists: [['sugar', '10g'], ['flour', '20g'], ...].
Getting the amounts into Python floats is a little tricky, since we have to concern ourselves with the units.
Again, choose a fixed set of units to make your life a little easier.
Then use the in operator or the built-in string method that checks whether a given string has a certain suffix. (I will leave it to you to find this method.)
Then the hard part is done. Hope I could help without giving too much away.
Part of your difficulty is knowing how to split your input on the comma -- use split(). Another problem is converting the string to a float -- use float().
Your last problem is mapping input strings to values. You could write a function that maps strings to costs:
if item == "milk_cup":
return milk_cup
if item == "flour_cup":
return flour_cup
...
...but the better way (DRY) to do it is to use a dictionary.
In my sample below I've used dict() to make the dictionary as then I don't have to quote every string.
Here's a sample:
#!/usr/bin/python
pricelist = dict(
    flour_cup=1.0,
    milk_cup=0.4,
)
input = ["flour_cup, 0.5", "milk_cup, 0.4"]
total = 0
for line in input:
    item, qty = line.split(",")
    item = item.strip()
    qty = float(qty)
    if item in pricelist:
        cost = qty * pricelist[item]
        print "%s: %.02f\n" % (item, cost)
        total += cost
    else:
        print "I don't know what '%s' is" % item
print "Total: %.02f" % total

Organizing and printing information by a specific row in a csv file

I wrote a code that takes in some data, and I end up with a csv file that looks like the following:
1,Steak,Martins
2,Fish,Martins
2,Steak,Johnsons
4,Veggie,Smiths
3,Chicken,Johnsons
1,Veggie,Johnsons
where the first column is a quantity, the second column is the type of item (in this case the meal), and the third column is an identifier (in this case it is family name). I need to print this information to a text file in a specific way:
Martins
1 Steak
2 Fish
Johnsons
2 Steak
3 Chicken
1 Veggie
Smiths
4 Veggie
So what I want is the family name followed by what that family ordered. I wrote the following code to accomplish this, but it doesn't seem to be quite there.
import csv
orders = "orders.txt"
messy_orders = "needs_sorting.csv"
with open(messy_orders, 'rb') as orders_for_sorting, open(orders, 'a') as final_orders_file:
    comp = []
    reader_sorting = csv.reader(orders_for_sorting)
    for row in reader_sorting:
        test_bit = [row[2]]
        if test_bit not in comp:
            comp.append(test_bit)
            final_orders_file.write(row[2])
            for row in reader_sorting:
                if [row[2]] == test_bit:
                    final_orders_file.write(row[0], row[1])
        else:
            print "already here"
            continue
What I end up with is the following
Martins
2 Fish
Additionally, I never see it print "already here", though I think I should if it were working properly. What I suspect is happening is that the program goes through the second for loop and then exits without continuing the first loop. Unfortunately, I'm not sure how to make it go back to the original loop once it has identified and printed all instances of a given family name in the file. The reason I have it set up this way is so that I can get the family name written as a header; otherwise I would just sort the file by family name. Please note that after running the orders through my first program, I did manage to sort everything such that each row represents the complete quantity of that type of food for that family (there are no recurring instances of a row containing both Steak and Martins).
This is a problem that you solve with a dictionary; which will accumulate your items by the last name (family name) of your file.
The second thing you have to do is accumulate a total of each type of meal - keeping in mind that the data you are reading is a string, and not an integer that you can add, so you'll have to do some conversion.
To put all that together, try this snippet:
import csv
d = dict()
with open(r'd:/file.csv') as f:
    reader = csv.reader(f)
    for row in reader:
        # if the family name doesn't
        # exist in our dictionary,
        # set it with a default value of a blank dictionary
        if row[2] not in d:
            d[row[2]] = dict()
        # If the meal type doesn't exist for this
        # family, set it up as a key in their dictionary
        # and set the value to int value of the count
        if row[1] not in d[row[2]]:
            d[row[2]][row[1]] = int(row[0])
        else:
            # Both the family and the meal already
            # exist in the dictionary, so just add the
            # count to the total
            d[row[2]][row[1]] += int(row[0])
Once you run through that loop, d looks like this:
{'Johnsons': {'Chicken': 3, 'Steak': 2, 'Veggie': 1},
'Martins': {'Fish': 2, 'Steak': 1},
'Smiths': {'Veggie': 4}}
Now it's just a matter of printing it out:
for family,data in d.iteritems():
    print('{}'.format(family))
    for meal, total in data.iteritems():
        print('{} {}'.format(total, meal))
At the end of the loop, you'll have the following (the order of the families and meals may differ, since dictionaries are unordered):
Johnsons
3 Chicken
2 Steak
1 Veggie
Smiths
4 Veggie
Martins
2 Fish
1 Steak
You can later improve this snippet by using defaultdict
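A possible defaultdict version of the same accumulation is sketched below. It reads the question's sample rows from an in-memory buffer instead of d:/file.csv, sorts the output alphabetically (a choice made here, not something the answer specifies), and is written for Python 3; the names orders and report_lines are illustrative.

```python
import csv
import io
from collections import defaultdict

# The sample rows from the question, held in memory for the example
sample = io.StringIO(
    u"1,Steak,Martins\n2,Fish,Martins\n2,Steak,Johnsons\n"
    u"4,Veggie,Smiths\n3,Chicken,Johnsons\n1,Veggie,Johnsons\n")

# Missing families get an empty inner dict; missing meals start at 0
orders = defaultdict(lambda: defaultdict(int))
for quantity, meal, family in csv.reader(sample):
    orders[family][meal] += int(quantity)

# Build the report in alphabetical order so the output is predictable
report_lines = []
for family in sorted(orders):
    report_lines.append(family)
    for meal in sorted(orders[family]):
        report_lines.append("%d %s" % (orders[family][meal], meal))
```

With defaultdict, both "if ... not in" checks from the snippet above disappear, because missing keys are created on first access.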
First-time replier, so here goes. Have you considered keeping track of the orders first and then writing them to a file? I tried a dict-based approach and it seems to work fine. The idea was to index by the family name and store a list of pairs containing the order quantities and types.
You may also want to consider the readability of your code - it's hard to follow and debug. However, what I think is happening is that the line
for row in reader_sorting:
iterates through reader_sorting. You read the 1st row, extract the family name, and then proceed to iterate over reader_sorting again in the inner loop. This time you start at the 2nd row, whose family name matches, and you print it successfully. The rest of the rows don't match, but you still iterate through them all. Now you've finished iterating through reader_sorting, and the loop finishes, even though the outer loop has read only one row.
One solution may be to create another iterator in the outer for loop so that the inner loop does not use up the iterator the outer loop goes through. However, you would then still need to deal with the possibility of double counting, or keep track of indices. Another way may be to keep track of the orders by family as you iterate:
import csv
orders = {}
with open('needs_sorting.csv') as file:
    needs_sorting = csv.reader(file)
    for amount, meal, family in needs_sorting:
        if family not in orders:
            orders[family] = []
        orders[family].append((amount, meal))
with open('orders.txt', 'a') as file:
    for family in orders:
        file.write('%s\n' % family)
        for amount, meal in orders[family]:
            file.write('%s %s\n' % (amount, meal))
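The iterator behaviour described in this answer can be seen without any file at all. In this small sketch (made-up family names, Python 3), the inner loop shares the outer loop's iterator and uses it up.

```python
# A plain iterator stands in for csv.reader, which behaves the same way
rows = iter(["Martins", "Martins", "Johnsons", "Smiths"])

outer_seen = []
inner_seen = []
for family in rows:
    outer_seen.append(family)
    for other in rows:  # same iterator: consumes every remaining item
        inner_seen.append(other)
```

After the inner loop finishes there is nothing left for the outer loop, so it runs exactly once, which matches the single family header in the question's output.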

How to print only the highest value for a key?

I ran into a problem when attempting to solve this task, so I'm here after failing a few times. I was wondering how I could print only the highest value (score) for a key (name) when a key stores multiple values, such as:
Rob Scored: 3,5,6,2,8
Martin Scored: 4,3,1,5,6,2
Tom Scored: 7,2,8
The name being the key and the scores being the values. Now I wish to get an output of
Martin Scored: 6
Rob Scored: 8
Tom Scored: 8
However, when I attempted the max function, it would ignore the alphabetical order. Just as a side note, that is a requirement, as is the fact that the other scores must be kept stored for later stages.
from collections import OrderedDict
dictionary = {}
for line in f:
    firstpart, secondpart = line.strip().split(':')
    dictionary[firstpart.strip()] = secondpart.strip()
    columns = line.split(": ")
    letters = columns[0]
    numbers = columns[1].strip()
    if d.get(letters):
        d[letters].append(numbers)
    else:
        d[letters] = list(numbers)
sorted_dict = OrderedDict(
    sorted((key, list(sorted(vals, reverse=True)))
           for key, vals in d.items()))
print (sorted_dict)
This does what you want:
# You don't need to use an OrderedDict if you only want to display in
# sorted order once
score_dict = {} # try using a more descriptive variable name
with open('score_file.txt') as infile:
    for line in infile:
        name_field, scores = line.split(':') # split the line
        name = name_field.split()[0] # split the name field and keep
                                     # just the name
        # grab all the scores, strip off whitespace, convert to int
        scores = [int(score.strip()) for score in scores.split(',')]
        # store name and scores in dictionary
        score_dict[name] = scores
        # if names can appear multiple times in the input file,
        # use this instead of your current if statement:
        #
        # score_dict.setdefault(name, []).extend(scores)
# now we sort the dictionary keys alphabetically and print the corresponding
# values
for name in sorted(score_dict.keys()):
    print("{} Scored: {}".format(name, max(score_dict[name])))
Please give this document a read: Code Like a Pythonista. It has a lot of suggestions for how to write better code, and it is where I learned the dict.setdefault() method for dealing with dictionaries where the values are lists.
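To make the dict.setdefault() suggestion concrete, here is a small sketch in which the same name appears on more than one line. The input lines are invented for the example and the code runs under Python 3.

```python
score_dict = {}
# Hypothetical input where Rob's scores are spread over two lines
lines = ["Rob Scored: 3,5", "Tom Scored: 7,2,8", "Rob Scored: 6,2,8"]

for line in lines:
    name_field, scores = line.split(":")
    name = name_field.split()[0]
    # setdefault returns the existing list, or stores and returns a new one
    score_dict.setdefault(name, []).extend(
        int(score) for score in scores.split(","))

# All scores are kept; only the report picks out the maximum per name
best = [(name, max(score_dict[name])) for name in sorted(score_dict)]
```

The full score lists stay in score_dict for later stages, and sorting the keys gives the alphabetical order the question asks for.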
On another note, in your question, you referred to an attempt to use the max function, but that function isn't anywhere in the code you provided. If you refer to failed attempts to accomplish something in your question, you should include the failed code as well so we can help you debug it. I was able to give you some code that accomplishes your task along with some other suggestions, but I can't debug your original code if you don't provide it. Since this is obviously a homework question, you should definitely spend some time figuring out why it didn't work in the first place.
