How to print only the highest value for a key?


I ran into a problem while attempting this task, so I'm here after failing a few times. How can I print only the highest value (score) for a key (name) when a key stores multiple values, such as:
Rob Scored: 3,5,6,2,8
Martin Scored: 4,3,1,5,6,2
Tom Scored: 7,2,8
The name being the key and the scores being the values. Now I wish to get an output of
Martin Scored: 6
Rob Scored: 8
Tom Scored: 8
However, when I attempted to use the max function, it ignored the alphabetical order. As a side note, alphabetical order is a requirement, and the other scores must be kept stored for later stages.
from collections import OrderedDict
dictionary = {}
for line in f:
    firstpart, secondpart = line.strip().split(':')
    dictionary[firstpart.strip()] = secondpart.strip()
    columns = line.split(": ")
    letters = columns[0]
    numbers = columns[1].strip()
    if d.get(letters):
        d[letters].append(numbers)
    else:
        d[letters] = list(numbers)
sorted_dict = OrderedDict(
    sorted((key, list(sorted(vals, reverse=True)))
           for key, vals in d.items()))
print (sorted_dict)

This does what you want:
# You don't need to use an OrderedDict if you only want to display in
# sorted order once
score_dict = {} # try using a more descriptive variable name
with open('score_file.txt') as infile:
    for line in infile:
        name_field, scores = line.split(':') # split the line
        name = name_field.split()[0] # split the name field and keep
                                     # just the name
        # grab all the scores, strip off whitespace, convert to int
        scores = [int(score.strip()) for score in scores.split(',')]
        # store name and scores in dictionary
        score_dict[name] = scores
        # if names can appear multiple times in the input file,
        # use this instead of your current if statement:
        #
        # score_dict.setdefault(name, []).extend(scores)
# now we sort the dictionary keys alphabetically and print the corresponding
# values
for name in sorted(score_dict.keys()):
    print("{} Scored: {}".format(name, max(score_dict[name])))
Please give this document a read: Code Like a Pythonista. It has a lot of suggestions for how to write better code, and it is where I learned the dict.setdefault() method for dealing with dictionaries where the values are lists.
On another note, in your question, you referred to an attempt to use the max function, but that function isn't anywhere in the code you provided. If you refer to failed attempts to accomplish something in your question, you should include the failed code as well so we can help you debug it. I was able to give you some code that accomplishes your task along with some other suggestions, but I can't debug your original code if you don't provide it. Since this is obviously a homework question, you should definitely spend some time figuring out why it didn't work in the first place.

Related

Python mydict( Trying to read values from mydict)

I am trying to read values that were stored in mydict. I keep getting an invalid response when running the program. The Excel sheet has the columns year, location, id, power, and isload. My goal is to print all the information associated with a given year and location number.
data = list(csv.reader(open(LOAD_GEN_DATAFILE)))
# read the entire CSV into Python.
# assume CSV has columns as described in the doc string
keyinput=input("Select Year of Study: ")
year=keyinput
mydict={"locA":1,"locb":2}
keyinput2=input(" Select the number associated to the TLA Pocket Location:")
if keyinput2 in mydict:
    location=keyinput2
else:
    print("Invalid Number")
for year,location,bus,change,isload in data:
    # convert the types from string to Python numbers
    change= float(change)
    bus = int(bus)
    if isload.isdigit() and int(isload):
        print()
    else:
        exit

I suspect that you are entering a number (for example, 2) and expecting the dictionary to tell you whether that value is in the dict. However, a dictionary's "in" operator works on keys, not values.
Consider this:
mydict = {"a" : 1, "b": 2}
print("Is 1 in mydict? ", 1 in mydict)
print("Is a in mydict? ", "a" in mydict)
print("Is 1 in mydicts values? ", 1 in mydict.values())
Output:
Is 1 in mydict? False
Is a in mydict? True
Is 1 in mydicts values? True
Keep in mind that asking whether a value is in a dict is an O(n) operation: the program may have to look at every value to determine whether it is in the dictionary. Asking whether a key is in a dictionary, however, is O(1) on average (very fast).
If you are always looking up the location based on the number, consider switching the keys/values of your dict around like this:
mydict={1:"locA",2:"locb"}
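A minimal sketch of that lookup, assuming the pocket numbers are integers and the user types the number as text. The helper name lookup_location is made up for illustration. Note that input() returns a string, so it has to be converted before it can match an integer key:

```python
# Keys are the pocket numbers, values are the location names
# (the same data as above, with keys and values swapped).
locations = {1: "locA", 2: "locb"}


def lookup_location(raw_text, table):
    """Return the location name for the typed number, or None if invalid."""
    try:
        number = int(raw_text.strip())
    except ValueError:
        # The text was not a whole number at all.
        return None
    # dict.get returns None when the key is missing.
    return table.get(number)


print(lookup_location("2", locations))
print(lookup_location("7", locations))
```

In the real program you would pass the result of input() as raw_text and print "Invalid Number" whenever the function returns None.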
I don't know much about reading from an Excel document or about the nature of your data, but assuming your for loop works, here is how you might load the values into the dict:
data = list(csv.reader(open(LOAD_GEN_DATAFILE)))
mydict = {}
for row in data:
    year,location,bus,change,isload = row[0:5]
    # convert the types from string to Python numbers
    change= float(change)
    bus = int(bus)
    # If this is a year not seen before, add it to the dictionary
    if year not in mydict:
        mydict[year] = {}
    busses_in_year = mydict[year]
    if location not in busses_in_year:
        busses_in_year[location] = []
    # Add the bus to the list of busses that stop at this location
    busses_in_year[location].append((bus, change, isload))
# assume CSV has columns as described in the doc string
year = input("Select Year of Study: ")
location = input(" Select the number associated to the TLA Pocket Location:")
if year in mydict and location in mydict[year]:
    busses_in_year = mydict[year]
    print("Here are all the busses at that location for that year: ")
    for bus in busses_in_year[location]:
        print(bus)
else:
    print("Invalid Year or Location")
Since you are using two keys to group/access the data, you can use nested dictionaries to solve this. Note the syntax to put things into and access things from a dictionary is dict_name[key] = value and value = dict_name[key] respectively.
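The nested-dictionary idea can also be sketched with collections.defaultdict, which creates the inner containers on first use so the "if key not in dict" checks disappear. The rows below are made-up sample data in the same column order as the CSV, not values from the real file:

```python
from collections import defaultdict

# Hypothetical sample rows: year, location, bus, change, isload
rows = [
    ["2015", "1", "101", "12.5", "1"],
    ["2015", "1", "102", "-3.0", "0"],
    ["2016", "2", "101", "7.25", "1"],
]

# Outer key: year. Inner key: location. Value: list of (bus, change, isload).
grouped = defaultdict(lambda: defaultdict(list))
for year, location, bus, change, isload in rows:
    grouped[year][location].append((int(bus), float(change), isload))

# Use .get() for lookups so a missing key does not create an empty entry.
entries = grouped.get("2015", {}).get("1", [])
for entry in entries:
    print(entry)
```

One design caveat: indexing a defaultdict with a key that is not there (grouped["1999"]) silently adds it, which is why the lookup above goes through .get() instead.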

How to compare the values of a dictionary with unknown keys? [duplicate]

I am a beginner in Python. I have written code in which the names of the contestants and their scores are stored in a dictionary. Let me call the dictionary results. I left it empty while writing the code; the keys and values are added to the dictionary while the program runs.
results={}
name=raw_input()
#some lines of code to get the score#
results[name]=score
#code#
name=raw_input()
#some lines of code to get the score#
results[name]=score
After the execution of the program, let us say results == {"john":22, "max":20}
I want to compare the scores of John and Max and declare the person with the highest score the winner. But I won't know the names of the contestants at the beginning of the program. So how can I compare the scores and declare one of them the winner?

You can just do this, to get the winner:
max(results, key=results.get)
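To spell out what that one-liner does, here is a small sketch using the sample dictionary from the question. The tie-handling part is an addition for illustration, not something the question asked for:

```python
results = {"john": 22, "max": 20}

# max() iterates over the keys; key=results.get makes it compare
# the keys by their scores instead of by the names themselves.
winner = max(results, key=results.get)
print(winner, results[winner])

# If several contestants share the top score, collect all of them.
top_score = max(results.values())
winners = sorted(name for name, score in results.items() if score == top_score)
print(winners)
```

Keep in mind that max() raises a ValueError on an empty dictionary, so check that results is not empty before calling it.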

Here's a working example that does what you want, which is basically getting the maximum item from a dictionary. The example also shows a couple of other tricks: generating reproducible pseudo-random values with a fixed seed instead of typing them in by hand, and getting the minimum value. Here you go:
import random
import operator
results = {}
names = ["Abigail", "Douglas", "Henry", "John", "Quincy", "Samuel",
         "Scott", "Jane", "Joseph", "Theodor", "Alfred", "Aeschylus"]
random.seed(1)
for name in names:
    results[name] = 18 + int(random.random() * 60)
sorted_results = sorted(results.items(), key=operator.itemgetter(1))
print "This is your input", results
print "This is your sorted input", sorted_results
print "The oldest guy is", sorted_results[-1]
print "The youngest guy is", sorted_results[0]

you can do:
import operator
stats = {'john':22, 'max':20}
maxKey = max(stats.items(), key=operator.itemgetter(1))[0]
print(maxKey,stats[maxKey])
you can also get the max tuple as a whole this way:
maxTuple = max(stats.items(), key=lambda x: x[1])
Hope it helps!

Organizing and printing information by a specific row in a csv file

I wrote a code that takes in some data, and I end up with a csv file that looks like the following:
1,Steak,Martins
2,Fish,Martins
2,Steak,Johnsons
4,Veggie,Smiths
3,Chicken,Johnsons
1,Veggie,Johnsons
where the first column is a quantity, the second column is the type of item (in this case the meal), and the third column is an identifier (in this case it is family name). I need to print this information to a text file in a specific way:
Martins
1 Steak
2 Fish
Johnsons
2 Steak
3 Chicken
1 Veggie
Smiths
4 Veggie
So what I want is the family name followed by what that family ordered. I wrote the following code to accomplish this, but it doesn't seem to be quite there.
import csv
orders = "orders.txt"
messy_orders = "needs_sorting.csv"
with open(messy_orders, 'rb') as orders_for_sorting, open(orders, 'a') as final_orders_file:
    comp = []
    reader_sorting = csv.reader(orders_for_sorting)
    for row in reader_sorting:
        test_bit = [row[2]]
        if test_bit not in comp:
            comp.append(test_bit)
            final_orders_file.write(row[2])
            for row in reader_sorting:
                if [row[2]] == test_bit:
                    final_orders_file.write(row[0], row[1])
        else:
            print "already here"
            continue
What I end up with is the following
Martins
2 Fish
Additionally, I never see it print "already here", though I think I should if it were working properly. What I suspect is happening is that the program goes through the second for loop, then exits without continuing the first loop. Unfortunately, I'm not sure how to make it go back to the original loop once it has identified and printed all instances of a given family name in the file. The reason I have it set up this way is so that I can get the family name written as a header. Otherwise I would just sort the file by family name. Please note that after running the orders through my first program, I did manage to sort everything such that each row represents the complete quantity of that type of food for that family (there are no recurring instances of a row containing both Steak and Martins).

This is a problem that you solve with a dictionary, which will accumulate your items by the family name in your file.
The second thing you have to do is accumulate a total of each type of meal - keeping in mind that the data you are reading is a string, and not an integer that you can add, so you'll have to do some conversion.
To put all that together, try this snippet:
import csv
d = dict()
with open(r'd:/file.csv') as f:
    reader = csv.reader(f)
    for row in reader:
        # if the family name doesn't
        # exist in our dictionary,
        # set it with a default value of a blank dictionary
        if row[2] not in d:
            d[row[2]] = dict()
        # If the meal type doesn't exist for this
        # family, set it up as a key in their dictionary
        # and set the value to int value of the count
        if row[1] not in d[row[2]]:
            d[row[2]][row[1]] = int(row[0])
        else:
            # Both the family and the meal already
            # exist in the dictionary, so just add the
            # count to the total
            d[row[2]][row[1]] += int(row[0])
Once you run through that loop, d looks like this:
{'Johnsons': {'Chicken': 3, 'Steak': 2, 'Veggie': 1},
'Martins': {'Fish': 2, 'Steak': 1},
'Smiths': {'Veggie': 4}}
Now it's just a matter of printing it out:
for family,data in d.iteritems():
    print('{}'.format(family))
    for meal, total in data.iteritems():
        print('{} {}'.format(total, meal))
At the end of the loop, you'll have:
Johnsons
3 Chicken
2 Steak
1 Veggie
Smiths
4 Veggie
Martins
2 Fish
1 Steak
You can later improve this snippet by using defaultdict
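A rough sketch of what the defaultdict version could look like. It is written for Python 3, and an in-memory io.StringIO stands in for the CSV file so the example runs on its own; the sample rows are the ones from the question:

```python
import csv
import io
from collections import defaultdict

# Stand-in for the CSV file so the sketch is self-contained.
sample = io.StringIO(
    "1,Steak,Martins\n"
    "2,Fish,Martins\n"
    "2,Steak,Johnsons\n"
    "4,Veggie,Smiths\n"
    "3,Chicken,Johnsons\n"
    "1,Veggie,Johnsons\n"
)

# Missing families get an empty inner dict; missing meals start at 0.
totals = defaultdict(lambda: defaultdict(int))
for quantity, meal, family in csv.reader(sample):
    totals[family][meal] += int(quantity)

for family, meals in totals.items():
    print(family)
    for meal, total in meals.items():
        print("{} {}".format(total, meal))
```

On Python 3.7 and later, dictionaries keep insertion order, so the families are printed in the order they first appear in the file (Martins, Johnsons, Smiths), which matches the layout the question asks for.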

First time replier so here's a go. Have you considered keeping track of the orders and then writing to a file? I tried something using a dict based approach and it seems to work fine. The idea was to index by the family name and store a list of pairs containing the order quantities and types.
You may also want to consider the readability of your code - it's hard to follow and debug. However, what I think is happening is that the inner line
for row in reader_sorting:
iterates through the same reader_sorting object as the outer loop. You read the 1st line, extract the family name, and then start iterating over reader_sorting again. This time you start at the 2nd line, whose family name matches, and you print it successfully. The rest of the lines don't match, but you still iterate through them all. Now you've finished iterating through reader_sorting, and the outer loop finishes too, even though it has read only one line.
One solution may be to create a separate iterator for the inner loop so that it doesn't exhaust the iterator the outer loop goes through. However, you would still need to deal with the possibility of double counting, or keep track of indices. Another way may be to keep track of the orders by family as you iterate.
import csv
orders = {}
with open('needs_sorting.csv') as file:
    needs_sorting = csv.reader(file)
    for amount, meal, family in needs_sorting:
        if family not in orders:
            orders[family] = []
        orders[family].append((amount, meal))
with open('orders.txt', 'a') as file:
    for family in orders:
        file.write('%s\n' % family)
        for amount, meal in orders[family]:
            file.write('%s %s\n' % (amount, meal))

Python: creating a dictionary that writes high scores to a file

First: you don't have to code this for me, unless you're a super awesome nice guy. But since you're all great at programming and understand it so much better than me and all, it might just be easier (since it's probably not too many lines of code) than writing paragraph after paragraph trying to make me understand it.
So - I need to make a list of high scores that updates itself upon new entries. So here it goes:
First step - done
I have player-entered input, which has been taken as a data for a few calculations:
import time
import datetime
print "Current time:", time1.strftime("%d.%m.%Y, %H:%M")
time1 = datetime.datetime.now()
a = raw_input("Enter weight: ")
b = raw_input("Enter height: ")
c = a/b
Second step - making high score list
Here, I would need some sort of a dictionary or a thing that would read the previous entries and check if the score (c) is (at least) better than the score of the last one in "high scores", and if it is, it would prompt you to enter your name.
After you entered your name, it would post your name, your a, b, c, and time in a high score list.
This is what I came up with, and it definitely doesn't work:
list = [("CPU", 200, 100, 2, time1)]
player = "CPU"
a = 200
b = 100
c = 2
time1 = "20.12.2012, 21:38"
list.append((player, a, b, c, time1))
list.sort()
import pickle
scores = open("scores", "w")
pickle.dump(list[-5:], scores)
scores.close()
scores = open("scores", "r")
oldscores = pickle.load(scores)
scores.close()
print oldscores()
I know I did something terribly stupid, but anyways, thanks for reading this and I hope you can help me out with this one. :-)

First, don't use list as a variable name. It shadows the built-in list object. Second, avoid using just plain date strings, since it is much easier to work with datetime objects, which support proper comparisons and easy conversions.
Here is a full example of your code, with individual functions to help divide up the steps. I am trying not to use any more advanced modules or functionality, since you are obviously just learning:
import os
import datetime
import cPickle
# just a constant we can use to define our score file location
SCORES_FILE = "scores.pickle"
def get_user_data():
    time1 = datetime.datetime.now()
    print "Current time:", time1.strftime("%d.%m.%Y, %H:%M")
    a = None
    while True:
        a = raw_input("Enter weight: ")
        try:
            a = float(a)
        except:
            continue
        else:
            break
    b = None
    while True:
        b = raw_input("Enter height: ")
        try:
            b = float(b)
        except:
            continue
        else:
            break
    c = a/b
    return ['', a, b, c, time1]

def read_high_scores():
    # initialize an empty score file if it does
    # not exist already, and return an empty list
    if not os.path.isfile(SCORES_FILE):
        write_high_scores([])
        return []
    with open(SCORES_FILE, 'r') as f:
        scores = cPickle.load(f)
    return scores

def write_high_scores(scores):
    with open(SCORES_FILE, 'w') as f:
        cPickle.dump(scores, f)

def update_scores(newScore, highScores):
    # reuse an anonymous function for looking
    # up the `c` (4th item) score from the object
    key = lambda item: item[3]
    # make a local copy of the scores
    highScores = highScores[:]
    lowest = None
    if highScores:
        lowest = min(highScores, key=key)
    # only add the new score if the high scores
    # are empty, or it beats the lowest one
    if lowest is None or (newScore[3] > lowest[3]):
        newScore[0] = raw_input("Enter name: ")
        highScores.append(newScore)
    # take only the highest 5 scores and return them
    highScores.sort(key=key, reverse=True)
    return highScores[:5]

def print_high_scores(scores):
    # loop over scores using enumerate to also
    # get an int counter for printing
    for i, score in enumerate(scores):
        name, a, b, c, time1 = score
        # #1 50.0 jdi (20.12.2012, 15:02)
        print "#%d\t%s\t%s\t(%s)" % \
            (i+1, c, name, time1.strftime("%d.%m.%Y, %H:%M"))

def main():
    score = get_user_data()
    highScores = read_high_scores()
    highScores = update_scores(score, highScores)
    write_high_scores(highScores)
    print_high_scores(highScores)

if __name__ == "__main__":
    main()
What it does now is add a new score only if there are no high scores yet or if it beats the lowest one. You could modify it to always add a new score when there are fewer than 5 previous scores, instead of requiring it to beat the lowest one, and only perform the lowest-score check once highScores holds 5 or more entries.
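That modification could look roughly like the sketch below. It is written for Python 3, and to keep it testable the player name is passed in as a parameter instead of being read with raw_input; the MAX_SCORES constant and the limit parameter are names introduced here for illustration:

```python
MAX_SCORES = 5  # size of the high-score table


def update_scores(new_score, high_scores, name, limit=MAX_SCORES):
    """Return a new top-`limit` list, adding new_score if it qualifies.

    Entries are lists shaped like [name, a, b, c, time]; the score is c (index 3).
    """
    scores = list(high_scores)  # work on a copy, leave the input untouched
    has_room = len(scores) < limit
    beats_lowest = bool(scores) and new_score[3] > min(s[3] for s in scores)
    if has_room or beats_lowest:
        entry = list(new_score)
        entry[0] = name
        scores.append(entry)
    # best score first, then cut the table down to `limit` entries
    scores.sort(key=lambda s: s[3], reverse=True)
    return scores[:limit]


table = [["a", 0, 0, 5.0, None], ["b", 0, 0, 3.0, None]]
table2 = update_scores(["", 1, 1, 1.0, None], table, "c")
print([(s[0], s[3]) for s in table2])
```

With only two entries in the table, the new score of 1.0 is accepted even though it is lower than everything else, because the table is not full yet.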

The first thing I noticed is that you did not tell list.sort() that the sorting should be based on the score, i.e. the fourth element of each entry. By default, list.sort() compares tuples element by element, so it sorts entries by their first element (i.e. the name), then moves on to the second element, the third element and so on to break ties. So, you have to tell list.sort() which item to use for sorting:
from operator import itemgetter
[...]
list.sort(key=itemgetter(3))
This will sort entries based on the item with index 3 in each tuple, i.e. the fourth item.
Also, print oldscores() will definitely not work since oldscores is not a function, hence you cannot call it with the () operator. print oldscores is probably better.

Here are the things I notice.
These lines seem to be in the wrong order:
print "Current time:", time1.strftime("%d.%m.%Y, %H:%M")
time1 = datetime.datetime.now()
When the user enters the height and weight, they are going to be read in as strings, not integers, so you will get a TypeError on this line:
c = a/b
You could solve this by casting a and b to float like so:
a = float(raw_input("Enter weight: "))
But you'll probably need to wrap this in a try/except block, in case the user enters garbage, i.e. anything that can't be cast to a float. Put the whole thing in a while loop until they get it right.
So, something like this:
b = None
while b is None:
    try:
        b = float(raw_input("Enter height: "))
    except ValueError:
        print "Height should be entered using only digits, like '187'"
On to the second part: you shouldn't use list as a variable name, since it's a builtin. I'll use high_scores.
# Add one default entry to the list
high_scores = [("CPU", 200, 100, 2, "20.12.2012, 4:20")]
You say you want to check the player score against the high score, to see if it's best, but if that's the case, why a list? Why not just a single entry? Anyhow, that's confusing me, not sure if you really want a high score list, or just one high score.
So, let's just add the score, no matter what:
Assume you've gotten their name into the name variable.
high_scores.append((name, a, b, c, time1))
Then apply the other answer from @Tamás

You definitely don't want a dictionary here. The whole point of a dictionary is to be able to map keys to values, without any sorting. What you want is a sorted list. And you've already got that.
Well, as Tamás points out, you've actually got a list sorted by the player name, not the score. On top of that, you want to sort in descending order, not ascending. You could use the decorate-sort-undecorate pattern, or a key function, or whatever, but you need to do something. Also, you've put it in a variable named list, which is a very bad idea, because that's already the name of the list type.
Anyway, you can find out whether to add something into a sorted list, and where to insert it if so, using the bisect module in the standard library. But it's probably simpler to just use something like SortedCollection or blist.
Here's an example:
highscores = SortedCollection(scores, key=lambda x: -x[3])
Now, when you finish the game:
highscores.insert_right((player, a, b, newscore, time1))
del highscores[-1]
That's it. If you were actually not in the top 10, you'll be added at #11, then removed. If you were in the top 10, you'll be added, and the old #10 will now be #11 and be removed.
If you don't want to prepopulate the list with 10 fake scores the way old arcade games used to, just change it to this:
highscores.insert_right((player, a, b, newscore, time1))
del highscores[10:]
Now, if there were already 10 scores, when you get added, #11 will get deleted, but if there were only 3, nothing gets deleted, and now there are 4.
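For comparison, here is a rough sketch of the plain bisect approach mentioned earlier, using only the standard library. The add_score helper and the parallel keys list are made up for this illustration; the keys list stores the negated scores so that the best score sorts first:

```python
import bisect


def add_score(table, keys, entry, limit=10):
    """Insert entry into table (best score first) and trim to `limit` entries.

    `keys` is a parallel list holding -score for each entry, kept in
    ascending order so that bisect can search it.
    """
    key = -entry[3]                       # the score is the 4th field
    pos = bisect.bisect_right(keys, key)  # ties go after existing entries
    keys.insert(pos, key)
    table.insert(pos, entry)
    del keys[limit:]
    del table[limit:]
    return pos < limit                    # True if the entry made the table


table, keys = [], []
add_score(table, keys, ("CPU", 200, 100, 2.0, "20.12.2012, 21:38"))
add_score(table, keys, ("Ann", 180, 60, 3.0, "21.12.2012, 10:00"))
add_score(table, keys, ("Bob", 150, 100, 1.5, "21.12.2012, 11:00"))
print([entry[0] for entry in table])
```

The return value tells you whether the new entry actually made it into the table, which is handy for deciding whether to congratulate the player.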
Meanwhile, I'm not sure why you're writing the new scores out to a pickle file, and then reading the same thing back in. You probably want to do the reading before adding the highscore to the list, and then do the writing after adding it.
You also asked how to "beautify the list". Well, there are three sides to that.
First of all, in the code, (player, a, b, c, time1) isn't very meaningful. Giving the variables better names would help, of course, but ultimately you still come down to the fact that when accessing list, you have to do entry[3] to get the score or entry[4] to get the time.
There are at least three ways to solve this:
Store a list (or SortedCollection) of dicts instead of tuples. The code gets a bit more verbose, but a lot more readable. You write {'player': player, 'weight': a, 'height': b, 'score': c, 'time': time1}, and then when accessing the list, you do entry['score'] instead of entry[3].
Use a collection of namedtuples. Now you can actually just insert ScoreEntry(player, a, b, c, time1), or you can insert ScoreEntry(player=player, weight=a, height=b, score=c, time=time1), whichever is more readable in a given case, and they both work the same way. And you can access the score as entry.score or as entry[3], again using whichever is more readable.
Write an explicit class for score entries. This is pretty similar to the previous one, but there's more code to write, and you can't do indexed access anymore, but on the plus side you don't have to understand namedtuple.
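As a small sketch of the namedtuple option: the field names below are illustrative (in the question, a is the weight and b is the height), and the sample values are taken from the question's default entry:

```python
from collections import namedtuple
from datetime import datetime

# Field names are an assumption made for this example.
ScoreEntry = namedtuple("ScoreEntry", ["player", "weight", "height", "score", "time"])

entry = ScoreEntry(player="CPU", weight=200.0, height=100.0, score=2.0,
                   time=datetime(2012, 12, 20, 21, 38))

print(entry.score)   # access by name
print(entry[3])      # positional access still works
print("{0.player}: weight {0.weight}, height {0.height}, score {0.score}".format(entry))
```

Because a namedtuple is still a tuple, existing code that indexes with entry[3] or sorts with a key function keeps working while you migrate to the named fields.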
Second, if you just print the entries, they look like a mess. The way to deal with that is string formatting. Instead of print scores, you do something like this:
print '\n'.join("{}: weight {}, height {}, score {} at {}".format(*entry)
                for entry in highscores)
If you're using a class or namedtuple instead of just a tuple, you can even format by name instead of by position, making the code much more readable.
Finally, the highscore file itself is an unreadable mess, because pickle is not meant for human consumption. If you want it to be human-readable, you have to pick a format, and write the code to serialize that format. Fortunately, the CSV format is pretty human-readable, and most of the code is already written for you in the csv module. (You may want to look at the DictReader and DictWriter classes, especially if you want to write a header line. Again, there's the tradeoff of a bit more code for a lot more readability.)
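A minimal sketch of the CSV idea with DictWriter and DictReader, written for Python 3. The field names and the dump_scores/load_scores helpers are made up for illustration, and an io.StringIO buffer stands in for a real file:

```python
import csv
import io

FIELDS = ["player", "weight", "height", "score", "time"]  # illustrative names


def dump_scores(scores, stream):
    """Write score dicts as CSV with a header line."""
    writer = csv.DictWriter(stream, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(scores)


def load_scores(stream):
    """Read the CSV back; every value comes back as a string."""
    return [dict(row) for row in csv.DictReader(stream)]


scores = [{"player": "CPU", "weight": 200, "height": 100, "score": 2.0,
           "time": "20.12.2012, 21:38"}]

buffer = io.StringIO()  # stands in for a real file opened with newline=""
dump_scores(scores, buffer)
print(buffer.getvalue())
buffer.seek(0)
print(load_scores(buffer))
```

The tradeoff compared to pickle is that everything is read back as text, so the score has to be converted with float() and the time parsed again after loading.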

'Splitting' List into several Arrays

I'm trying to complete a project that will show total annual sales from a specific list contained in a .txt file.
The list is formatted this way:
-lastname, firstname (string)
-45.7 (float)
-456.4 (float)
-345.5 (float)
-lastname2, firstname2 (string)
-3354.7 (float)
-54.6 (float)
-56.2 (float)
-lastname3, firstname3 (string)
-76.6 (float)
-34.2 (float)
-48.2 (float)
And so on.... Actually, 7 different "employees" followed by 12 set of "numbers" (months of the year)....but that example should suffice to give an idea of what I'm trying to do.
I need to output this specific information of every "employee"
-Name of employee
-Total Sum (sum of the 12 numbers in the list)
So my logic is taking me to this conclusion, but I don't know where to start:
Create 7 different arrays to store each "employee" data.
With this logic, I need to split the main list into independent arrays so I can work with them.
How can this be achieved? And also, if I don't have a predefined number of employees (but a defined format :: "Name" followed by 12 months of numbers)...how can I achieve this?
I'm sure I can figure once I get an idea how to "split" a list in different sections -- Every 13 lines?

Yes, at every thirteenth line you'd have the information of an employee.
However, instead of using seven different lists, you can use a dictionary of lists, so that you don't have to worry about the number of employees.
You can also use a parameter for the number of lines that belong to each employee.
You could do the following:
infile = open("file.txt", "rt")
employee = dict()
name = infile.readline().strip()
while name:
    employee[name] = list()
    for i in xrange(12):  # 12 monthly values follow each name
        val = float(infile.readline().strip())
        employee[name].append(val)
    name = infile.readline().strip()
Some ways to access dictionary entries:
for name, months in employee.items():
    print name
    print months
for name in employee.keys():
    print name
    print employee[name]
for months in employee.values():
    print months
for name, months in zip(employee.keys(), employee.values()):
    print name
    print months
The entire process goes as follows:
infile = open("file.txt", "rt")
employee = dict()
name = infile.readline().strip()
while name:
    val = 0.0
    for i in xrange(12):  # sum the 12 monthly values
        val += float(infile.readline().strip())
    employee[name] = val
    print ">>> Employee:", name, " -- total sales:", str(employee[name])
    name = infile.readline().strip()
Sorry for beating around the bush a bit (:

Here is one option.
It's not elegant, but it's a workable brute-force approach.
summed = 0
with open("file.txt", "rt") as f:
    print f.readline() # We print first line (first man)
    for line in f:
        # then we suppose every line is float.
        try:
            # convert to float
            value = float(line.strip())
            # add to sum
            summed += value
        # If it does not convert, then it is next person
        except ValueError:
            # print sum for previous person
            print summed
            # print new name
            print line
            # reset sum
            summed = 0
    # at the end of the file there are no errors, so we print the last result
    print summed
Since you need more flexibility, here is another option:
data = {} # dict: list of all values for person by person name
with open("file.txt", "rt") as f:
    data_key = f.readline().strip() # We remember first line (first man)
    data[data_key] = [] # empty list of values
    for line in f:
        # then we suppose every line is float.
        try:
            # convert to float
            value = float(line.strip())
            # add to data
            data[data_key].append(value)
        # If it does not convert, then it is next person
        except ValueError:
            # next person's name (strip the trailing newline)
            data_key = line.strip()
            # new list
            data[data_key] = []
Q: let's say that I want to print a '2% bonus' to employees that made more than 7000 in total sales (12 months)
for employee, stats in data.iteritems():
    if sum(stats) > 7000:
        print employee + " done 7000 in total sales! need 2% bonus"

I would not create 7 different arrays. I would create some sort of data structure to hold all the relevant information for one employee in one data type (this is python, but surely you can create data structures in python as well).
Then, as you process the data for each employee, all you have to do is iterate over one array of employee data elements. That way, it's much easier to keep track of the indices of the data (or maybe even eliminates the need to!).
This is especially helpful if you want to sort the data somehow. That way, you'd only have to sort one array instead of 7.
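One way to sketch that idea in Python 3 is a small dataclass per employee plus a helper that cuts the flat list of lines into groups of 13. The Employee class, the parse_employees helper and the sample names are all made up for this illustration, and the sample uses 3 figures per employee instead of 12 to stay short:

```python
from dataclasses import dataclass, field
from typing import List

MONTHS = 12  # assumed number of sales figures after each name


@dataclass
class Employee:
    name: str
    monthly_sales: List[float] = field(default_factory=list)

    @property
    def total(self) -> float:
        return sum(self.monthly_sales)


def parse_employees(lines, months=MONTHS):
    """Group a flat list of lines into Employee records of 1 + `months` lines."""
    lines = [line.strip() for line in lines if line.strip()]
    step = months + 1
    employees = []
    for start in range(0, len(lines), step):
        chunk = lines[start:start + step]
        employees.append(Employee(chunk[0], [float(x) for x in chunk[1:]]))
    return employees


# Tiny made-up sample with 3 figures per employee instead of 12.
sample = ["Doe, Jane", "45.7", "456.4", "345.5",
          "Roe, Rick", "3354.7", "54.6", "56.2"]
staff = parse_employees(sample, months=3)
for person in sorted(staff, key=lambda e: e.total, reverse=True):
    print(person.name, round(person.total, 2))
```

Sorting by total is then a single sorted() call over one list, which is the benefit described above; with a real file you would pass the result of readlines() and leave months at its default of 12.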
