Dictionary in python for implementing the 37% Rule


I am trying to implement the famous 37% Rule from the book "Algorithms to Live By" by Brian Christian.
The 37% Rule basically says that when you need to screen a range of options in a limited amount of time - be they candidates for a job, new apartments, or potential romantic partners - the best time to make a decision is when you've looked at 37% of those options.
At that point in a selection process, you'll have gathered enough information to make an informed decision, but you won't have wasted too much time looking at more options than necessary. At the 37% mark, you're in a good place to pick the best of the bunch.
A common thought experiment to demonstrate this theory - developed by non-PC math guys in the 1960s - is called "The Secretary Problem."
The program runs, but I want the selection phase to start only after the first 37% of the candidates have been reviewed. Because I am using a dictionary, I cannot access the elements that come after a given number of candidates. How can I make this possible?
import matplotlib.pyplot as plt
# for visualising scores

def initiate(candidates):
    print("Total candidates are : ",len(candidates))
    lookup_num=int(len(candidates) *0.37)
    #finds 37% of the candidates
    average=lookup(candidates)
    # returns average of lookup phase
    chosen=select_cad(candidates,lookup_num,average)
    # selects a candidate based on lookUp average
    print("The chosen candidate is : {} ".format(chosen))

def lookup(candidates):
    average_score=0
    for cands,score in candidates.items():
        average_score+=score
    average_score=int(average_score/len(candidates))
    print("The average score in lookup is : ",average_score)
    #return the average score to average local variable
    return average_score

def select_cad(candidates,lookup_num,average):
    for cands,score in candidates.items():
        if(score>average):
            return cands
        else:
            continue
    print("Something went wrong!")
    quit

candidates={"Husain":85, "Chirag":94 ,"Asim":70,"Ankit":65 ,"Saiteja":65 ,"Absar":75 ,"Premraj":70 ,"Sagar":75 ,"Himani":75 ,"Parth":76 ,"Sumedha":70 ,"Revati":65 ,"Sageer":65 ,"Noorjahan":60 ,"Muzammil":65 ,"Shifa":56 , "Dipti":65 , "Dheeraj":70 }
initiate(candidates)
plt.bar(range(len(candidates)), list(candidates.values()), align='center', color='green')
plt.xticks(range(len(candidates)), list(candidates.keys()))
plt.show()
How can I make it more flexible to update the average score even in selection phase?

Just read about this "Rule of 37%" so I hope I understood it correctly. I would implement something like that:
import random

def rule_of_37(candidates):
    # first I date random 37% of the candidates
    who_i_date = random.sample(list(candidates), int(len(candidates)*.37))
    print("I meet those 37% of the candidates", who_i_date)
    # then I calculate their average score
    average = sum(candidates[name] for name in who_i_date) / len(who_i_date)
    print("The average score was", average)
    # then I settle with the next person that has a score higher than the average (obviously I cannot re-date candidates)
    # hopefully there is still someone with a higher score than average...
    try:
        who_i_marry = next(name
                           for name, score in candidates.items()
                           if name not in who_i_date
                           and score > average)
        print("I end up with", who_i_marry, "who has a score of", candidates[who_i_marry])
    except StopIteration:
        print("I end up all alone, there was nobody left with a higher score than", average, "...")

candidates={"Husain":85, "Chirag":94 ,"Asim":70,"Ankit":65 ,"Saiteja":65 ,"Absar":75 ,"Premraj":70 ,"Sagar":75 ,"Himani":75 ,"Parth":76 ,"Sumedha":70 ,"Revati":65 ,"Sageer":65 ,"Noorjahan":60 ,"Muzammil":65 ,"Shifa":56 , "Dipti":65 , "Dheeraj":70 }
rule_of_37(candidates)
Example execution (yours may vary since the first 37% candidates are picked at random):
I meet those 37% of the candidates ['Dipti', 'Chirag', 'Revati', 'Sumedha', 'Dheeraj', 'Muzammil']
The average score was 71.5
I end up with Husain who has a score of 85
If you want to select the first candidates yourself instead of relying on random, you can simply replace who_i_date by your pre-selected list:
who_i_date = ['Husain', 'Chirag', 'Asim', 'Ankit', 'Saiteja', 'Absar']
But then the other 63% will be arbitrarily ordered, so you may not always select the same one (unless you use CPython 3.6+, or any Python 3.7+, where dicts preserve insertion order). If you want to date the remaining 63% in a specific order, you have to iterate over a list of the candidates' names rather than over the dict that maps names to scores.
I leave that up to you.
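For readers who want to see that last step, here is a minimal sketch of iterating over an explicit list of names. The function name rule_of_37_ordered and its arguments are made up for illustration; it keeps this answer's "beat the look-phase average" criterion and simply fixes the interview order.

```python
def rule_of_37_ordered(names, scores):
    """Look at the first 37% of `names`, then pick the next name above their average.

    `names` is a list giving the interview order; `scores` maps name -> score.
    Returns the chosen name, or None if nobody later beats the average.
    """
    cutoff = int(len(names) * 0.37)
    if cutoff == 0:
        return None  # too few candidates for a look phase
    looked_at = names[:cutoff]
    average = sum(scores[name] for name in looked_at) / cutoff
    # Only candidates after the look phase are eligible.
    for name in names[cutoff:]:
        if scores[name] > average:
            return name
    return None


candidates = {"Husain": 85, "Chirag": 94, "Asim": 70, "Ankit": 65, "Saiteja": 65,
              "Absar": 75, "Premraj": 70, "Sagar": 75, "Himani": 75, "Parth": 76,
              "Sumedha": 70, "Revati": 65, "Sageer": 65, "Noorjahan": 60,
              "Muzammil": 65, "Shifa": 56, "Dipti": 65, "Dheeraj": 70}
order = list(candidates)  # explicit interview order; reorder or shuffle as needed
chosen = rule_of_37_ordered(order, candidates)
print("Chosen:", chosen)
```

Because the order lives in a plain list, the result no longer depends on how the dict happens to iterate.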

You can use lookup_num along with numpy to simulate what candidates are "seen" within your function and then calculate the average score. This function will randomly select lookup_num number of candidates from your dictionary (without replacement). Using that subset the average_score is calculated. The function will return the average score along with the dictionary of "seen" candidates to determine who was the best candidate from the 37% subset.
import numpy as np

def lookup(candidates,lookup_num):
    # Randomly select lookup_num candidates without replacement
    # (list(...) is needed on Python 3, where keys() returns a view rather than a list)
    seen_names = np.random.choice(list(candidates.keys()), size=lookup_num, replace=False)
    # Create a dictionary with the scores from the seen candidates.
    seen = {k: v for (k, v) in candidates.items() if k in seen_names}
    # Calculate the average score for the candidates who were seen
    average_score = sum([v for (k, v) in seen.items()]) / float(lookup_num)
    return seen, average_score
Your code calling lookup would become:
seen, average_score=lookup(candidates,lookup_num)
With the average_score and the dictionary of candidates who were seen, you can compare that to the rest of the candidates and apply your decision criteria for choosing the best candidate.

I've updated my previous code with a few variables and used lookup_num for iterating. Still using the unordered dictionary and it works like a charm. Check it out.
import matplotlib.pyplot as plt
# for visualising scores

def initiate(candidates):
    print("Total candidates are : ",len(candidates))
    lookup_num=int(len(candidates) *0.37)
    #finds 37% of the candidates
    average, lookup_candidates=lookup(candidates,lookup_num)
    # returns average of lookup phase
    chosen=select_cad(candidates,lookup_num,average,lookup_candidates)
    # selects a candidate based on lookUp average
    print("The chosen candidate is : {} ".format(chosen))

def lookup(candidates,lookup_num):
    average_score=0
    count=0
    lookup_candidates=[]
    for cands,score in candidates.items():
        if(not count>=lookup_num):
            lookup_candidates.append(cands)
            average_score+=score
            count+=1
    average_score=int(average_score/count)
    print("Look Up candidates are : ",lookup_candidates)
    print("The average score in lookup is : ",average_score)
    #return the average score to average local variable
    return average_score,lookup_candidates

def select_cad(candidates,lookup_num,average,lookup_candidates):
    for cands,score in candidates.items():
        if(score>average and cands not in lookup_candidates):
            #because 37% rule does not allows us to go back to select a candidate from the lookup phase
            return cands
            #return selected candidate to chosen variable
        else:
            continue
    print("Something went wrong!")
    quit

candidates={"Husain":85, "Chirag":94 ,"Asim":70,"Ankit":65 ,"Saiteja":65 ,"Absar":75 ,"Premraj":70 ,"Sagar":75 ,"Himani":75 ,"Parth":76 ,"Sumedha":70 ,"Revati":65 ,"Sageer":65 ,"Noorjahan":60 ,"Muzammil":65 ,"Shifa":56 , "Dipti":65 , "Dheeraj":70 }
initiate(candidates)
plt.bar(range(len(candidates)), list(candidates.values()), align='center', color='green')
plt.xticks(range(len(candidates)), list(candidates.keys()))
plt.xlabel("Candidates")
plt.ylabel("Score")
plt.title("37% Algorithm")
plt.show()
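On the follow-up about updating the average during the selection phase: one possible approach, sketched below, is to keep a running total and count, and fold every rejected candidate's score into them before looking at the next one. The function name select_with_running_average and the look_fraction parameter are assumptions made for this sketch, not part of the code above.

```python
def select_with_running_average(names, scores, look_fraction=0.37):
    """Pick the first post-look candidate who beats the average of everyone seen so far."""
    cutoff = int(len(names) * look_fraction)
    total = sum(scores[name] for name in names[:cutoff])
    count = cutoff
    for name in names[cutoff:]:
        # Compare against the average of all candidates seen before this one.
        if count and scores[name] > total / count:
            return name
        # Rejected: this score now becomes part of the running average.
        total += scores[name]
        count += 1
    return None


candidates = {"Husain": 85, "Chirag": 94, "Asim": 70, "Ankit": 65, "Saiteja": 65,
              "Absar": 75, "Premraj": 70, "Sagar": 75, "Himani": 75, "Parth": 76,
              "Sumedha": 70, "Revati": 65, "Sageer": 65, "Noorjahan": 60,
              "Muzammil": 65, "Shifa": 56, "Dipti": 65, "Dheeraj": 70}
chosen = select_with_running_average(list(candidates), candidates)
print("Chosen with running average:", chosen)
```

With this data the bar drops slightly once Premraj (70) is rejected, so the choice can differ from the fixed-average version.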

Related

How to set up a linear programming model (transportation problem) using python/PuLp

I am working on a transportation/replenishment model wherein I need to solve for lowest cost. The variables are:
Warehouses - several possible origin points of a shipment.
Items - in this example I only use two items. Each Item-Store combination has a unique demand value.
Inventory - available inventory for each 'Item' in each 'Warehouse'
Stores - the destination point for each shipment. In this example I only use two Stores.
Costs - unique costs for each Warehouse-Item-Store combination, which will be used to solve for lowest cost.
Demand - the quantity of each 'Item' that each 'Store' wants to receive; the model should fulfill 100% unless inventory is not available.
I am not very experienced with Python. It seems that I am somewhat close, however, I have a problem I haven't been able to fix yet: if Inventory is too low to fulfill all Demand, the model will break and return an "infeasible" result. Instead of this, I want the model to satisfy Demand until Inventory reaches zero and then return the optimized results up to that point. I understand that the result I am getting now is because I have set fulfilled qty equal to demand in one of my constraints, but I am not sure how to modify/fix it.
Here is the code so far - this is a result of much Google searching and sort of combining bits and pieces of code together like Dr. Frankenstein - if anything in here looks stupid please let me know. With the current inputs this will not work since Inventory does not satisfy Demand, but it seems to work if Inventory is higher (e.g. change Store1-SKU_B demand from 250 to 50)
from pulp import *
import pandas as pd
# Creates a list of all the supply nodes
warehouses = ["WHS_1","WHS_2","WHS_3"]
# Creates a dictionary for Inventory by Node-SKU
inventory = {"WHS_1": {"SKU_A":50,"SKU_B":100},
             "WHS_2": {"SKU_A":50,"SKU_B":75} ,
             "WHS_3": {"SKU_A":150,"SKU_B":25} ,
             }
# Store list
stores = ["Store1","Store2"]
# SKU list
items = ["SKU_A","SKU_B"]
# Creates a dictionary for the number of units of demand for each Store-SKU
demand = {
    "Store1": {"SKU_A":100,"SKU_B":250},
    "Store2": {"SKU_A":100,"SKU_B":50},
}
# Creates a dictionary for the lane cost for each Node-Store-SKU
costs = {
    "WHS_1": {"Store1": {"SKU_A":10.50,"SKU_B":3.75},
              "Store2": {"SKU_A":15.01,"SKU_B":5.15}},
    "WHS_2": {"Store1": {"SKU_A":9.69,"SKU_B":3.45},
              "Store2": {"SKU_A":17.50,"SKU_B":6.06}},
    "WHS_3": {"Store1": {"SKU_A":12.12,"SKU_B":5.15},
              "Store2": {"SKU_A":16.16,"SKU_B":7.07}},
}
# Creates the 'prob' variable to contain the problem data
prob = LpProblem("StoreAllocation", LpMinimize)
# Creates a list of tuples containing all the possible routes for transport
routes = [(w, s, i) for w in warehouses for s in stores for i in items]
# A dictionary called 'vars' is created to contain the referenced variables (the routes)
vars = LpVariable.dicts("Route", (warehouses, stores, items), 0, None, LpInteger)
# The objective function is added to 'prob' first
prob += (
    lpSum([vars[w][s][i] * costs[w][s][i] for (w, s, i) in routes]),
    "Sum_of_Transporting_Costs",
)
# Supply constraint, must not exceed Node Inventory
for w in warehouses:
    for i in items:
        prob += (
            lpSum([vars[w][s][i] for s in stores]) <= inventory[w][i],
            f"Sum_of_Products_out_of_Warehouse_{w}{i}",
        )
# Supply constraint, supply to equal demand
for s in stores:
    for i in items:
        prob += (
            lpSum([vars[w][s][i] for w in warehouses]) == demand[s][i],
            f"Sum_of_Products_into_Store{s}{i}",
        )
# The problem data is written to an .lp file
prob.writeLP("TestProblem.lp")
prob.solve()
# The status of the solution is printed to the screen
print("Status:", LpStatus[prob.status])
# Each of the variables is printed with its resolved optimum value
for v in prob.variables():
    print(v.name, "=", v.varValue)
# The optimised objective function value is printed to the screen
print("Total Cost of Fulfillment = ", value(prob.objective))

This is good. Your model is set up well. Let's talk about supply...
So this is a common transshipment model and you want to minimize cost, but the default answer is to ship nothing for a cost of zero, which is not good. As you know, you need upward pressure on the deliveries to meet demand, or at least do the best you can with the inventory on hand if demand > inventory.
The first "cheap and easy" thing to do is to reduce the aggregate deliveries of each product to what is available... across all stores. In your current code you are trying to force the deliveries == demand, which may not be possible. So you can take a step back and just say "deliver the aggregate demand, or at least all of the inventory". In pseudocode that would be something like:
total_delivery[sku] = min(all inventory, demand)
You could do the same for the other SKUs and then just sum all of the deliveries by SKU across all warehouses and destinations and force:
for sku in SKUs:
    sum(deliver[w, s, sku] for w in warehouses for s in stores) >= total_delivery[sku]
Realize that the parameter total_delivery is NOT a variable, it is discernible from the data before doing anything.
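As a rough illustration of that point, total_delivery can be computed in plain Python from the question's inventory and demand dictionaries before any model is built. This is only a sketch of the pre-processing step; the loop variable names stock and wanted are made up here.

```python
inventory = {"WHS_1": {"SKU_A": 50, "SKU_B": 100},
             "WHS_2": {"SKU_A": 50, "SKU_B": 75},
             "WHS_3": {"SKU_A": 150, "SKU_B": 25}}
demand = {"Store1": {"SKU_A": 100, "SKU_B": 250},
          "Store2": {"SKU_A": 100, "SKU_B": 50}}
items = ["SKU_A", "SKU_B"]

# For each SKU: deliver the total demand, or all of the inventory if that is smaller.
total_delivery = {}
for sku in items:
    total_inventory = sum(stock[sku] for stock in inventory.values())
    total_demand = sum(wanted[sku] for wanted in demand.values())
    total_delivery[sku] = min(total_inventory, total_demand)

print(total_delivery)  # {'SKU_A': 200, 'SKU_B': 200}
```

Here SKU_A is limited by demand (200 wanted, 250 in stock) while SKU_B is limited by inventory (300 wanted, 200 in stock), which is exactly the case that made the original equality constraint infeasible.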
The above will make the model run, but there are issues. The model will likely "overdeliver" to some sites because we are aggregating the demand. So if you have 100 of something and split demand of 50/50 in 2 sites, it will deliver 100 to the cheapest site.... not good. So you need to add a constraint to limit the delivery to each site to the demand, regardless of source. Something like:
for s in stores:
    for sku in SKUs:
        sum(deliver[w, s, sku] for w in warehouses) <= demand[s, sku]
The addition of those should make your model run. The result (if inventory is short) will be disproportionate delivery to the cheap sites. Perhaps that is OK. Balancing it is a little complicated.
...
Regarding your model, you have your variable constructed as a nested dictionary... That is why you need to index it as vars[w][s][i]. That is perfectly fine, but I find it much easier to tuple-index the variables, and you already have the basis set routes to work with. So I would:
deliver = LpVariable.dicts("deliver", routes, 0, None, LpInteger)
then you can index it like I have in the examples above...

Calculate probability of a flush in poker

I have code that keeps looping until a flush is dealt.
Now I am trying to use count to record how many hands are dealt, and then divide one by that count to get the probability.
With the code I have right now, count comes back as 0.
from collections import namedtuple
from random import shuffle

Card = namedtuple("Card", "suit, rank")

class Deck:
    suits = '♦♥♠♣'
    ranks = '23456789JQKA'

    def __init__(self):
        self.cards = [Card(suit, rank) for suit in self.suits for rank in self.ranks]
        shuffle(self.cards)

    def deal(self, amount):
        return tuple(self.cards.pop() for _ in range(amount))

flush = False
count = 0
while not flush:
    deck = Deck()
    stop = False
    while len(deck.cards) > 5:
        hand = deck.deal(5)
        # (Card(suit='♣', rank='7'), Card(suit='♠', rank='2'), Card(suit='♥', rank='4'), Card(suit='♥', rank='K'), Card(suit='♣', rank='3'))
        if len(set(card.suit for card in hand)) > 1:
            #print(f"No Flush: {hand}")
            continue
        print(f"Yay, it's a Flush: {hand}")
        flush = True
        break
        if flush:
            break
        else:
            count +=1
print(f'Count is {count}')
There is a little more code at the top used for the init methods; if you need that too, let me know.

Your code (and what is available in @Mason's answer) will estimate the probability of eventually getting your first flush. To estimate the probability of getting a flush in general, which I believe is what you're after, you have to run that experiment many thousands of times over. In practice this is called a Monte Carlo simulation.
Side note: When I began learning about Monte Carlos I thought they were a sort of "magical", mysteriously complex thing... mostly because their name sounds so exotic. Don't be fooled. "Monte Carlo" is just an overly fancy and arbitrary name for "simulation". They can be quite elementary.
Even so, simulations are kind of magical because you can use them to brute force a solution out of a complex system even when a mathematical model of that system is hard to come by. Say, for example, you don't have a firm understanding of combination or permutation math - which would produce the exact answer to your question "What are the odds of getting a flush?". We can run many simulations of your card game to figure out what that probability would be to a high degree of certainty. I've done that below (commented out parts of your original code that weren't needed):
from collections import namedtuple
from random import shuffle
import pandas as pd

#%% What is the likelihood of getting flush? Mathematical derivation
""" A flush consists of five cards which are all of the same suit.
We must remember that there are four suits each with a total of 13 cards.
Thus a flush is a combination of five cards from a total of 13 of the same suit.
This is done in C(13, 5) = 1287 ways.
Since there are four different suits, there are a total of 4 x 1287 = 5148 flushes possible.
Some of these flushes have already been counted as higher ranked hands.
We must subtract the number of straight flushes and royal flushes from 5148 in order to
obtain flushes that are not of a higher rank.
There are 36 straight flushes and 4 royal flushes.
We must make sure not to double count these hands.
This means that there are 5148 – 40 = 5108 flushes that are not of a higher rank.
We can now calculate the probability of a flush as 5108/2,598,960 = 0.1965%.
This probability is approximately 1/509. So in the long run, one out of every 509 hands is a flush."""
"SOURCE: https://www.thoughtco.com/probability-of-a-flush-3126591"
mathematically_derived_flush_probability = 5108/2598960 * 100

#%% What is the likelihood of getting flush? Monte Carlo derivation
Card = namedtuple("Card", "suit, rank")

class Deck:
    suits = '♦♥♠♣'
    # 'T' stands for ten: the original string skipped it, which left only 48 cards in the deck
    ranks = '23456789TJQKA'

    def __init__(self):
        self.cards = [Card(suit, rank) for suit in self.suits for rank in self.ranks]
        shuffle(self.cards)

    def deal(self, amount):
        return tuple(self.cards.pop() for _ in range(amount))

#flush = False
hand_count = 0
flush_count = 0
flush_cutoff = 150 # Increase this number to run the simulation over more hands.
column_names = ['hand_count', 'flush_count', 'flush_probability', 'estimation_error']
# Rows are collected in a list and turned into a DataFrame once at the end,
# because DataFrame.append was removed in pandas 2.0.
rows = []
while flush_count < flush_cutoff:
    deck = Deck()
    while len(deck.cards) > 5:
        hand_count +=1
        hand = deck.deal(5)
        # (Card(suit='♣', rank='7'), Card(suit='♠', rank='2'), Card(suit='♥', rank='4'), Card(suit='♥', rank='K'), Card(suit='♣', rank='3'))
        if len(set(card.suit for card in hand)) == 1:
            # print(f"Yay, it's a Flush: {hand}")
            flush_count +=1
            # break
        # else:
            # print(f"No Flush: {hand}")
        monte_carlo_derived_flush_probability = flush_count / hand_count * 100
        estimation_error = (monte_carlo_derived_flush_probability - mathematically_derived_flush_probability) / mathematically_derived_flush_probability * 100
        rows.append([hand_count, flush_count, monte_carlo_derived_flush_probability, estimation_error])
hand_results = pd.DataFrame(rows, columns=column_names)

#%% Analyze results
# Show how each consecutive hand helps us estimate the flush probability
hand_results.plot.line('hand_count','flush_probability').axhline(y=mathematically_derived_flush_probability,color='r')
# As the number of hands (experiments) increases, our estimation of the actual probability gets better.
# Below the error gets closer to 0 percent as the number of hands increases.
hand_results.plot.line('hand_count','estimation_error').axhline(y=0,color='black')

#%% Memory usage
print("Memory used to store all %s runs: %s megabytes" % (len(hand_results),round(hand_results.memory_usage(index=True,deep=True).sum()/1000000, 1)))
In this particular case, thanks to math, we could have just confidently derived the probability of getting a flush as 0.1965%. To check that our simulation arrived at the correct answer, we can compare its output after 80,000 hands:
As you can see, our simulated flush_probability (in blue) approaches the mathematically derived probability (the horizontal line).
Similarly, below is a plot of the estimation_error between the simulated probability and the mathematically derived value. As you can see, the estimation error was more than 100% off in the early runs of the simulation but gradually settled to within 5% of the derived value.
If you were to run the simulation for, say, twice the number of hands, then we would see the simulated curves eventually overlap with the horizontal reference line in both charts - signifying that the simulated answer converges to the mathematically derived answer.
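The counting argument quoted in the code's docstring can be checked in a few lines with math.comb (available from Python 3.8). This is just a verification sketch; the variable names are chosen here for readability.

```python
from math import comb

flushes_per_suit = comb(13, 5)        # choose 5 of the 13 cards in one suit: 1287
all_flushes = 4 * flushes_per_suit    # four suits: 5148
plain_flushes = all_flushes - 40      # minus 36 straight flushes and 4 royal flushes: 5108
total_hands = comb(52, 5)             # all five-card hands: 2598960

flush_probability_pct = plain_flushes / total_hands * 100
print(f"{flush_probability_pct:.4f}%")  # 0.1965%
```

Note that the simulation's check (all five cards share a suit) also counts straight and royal flushes, so strictly speaking it converges to 5148/2,598,960, about 0.1981%, which is slightly above the 0.1965% reference.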
To simulate or not to simulate?
Finally, you might wonder,
"If I can generate a precise answer to a problem by simulating it, then why bother with all the complicated math in the first place?"
The answer is, as with just about any decision in life, "trade-offs".
In our example, we could run the simulation over enough hands to get a precise answer with a high degree of confidence. However, if one is running a simulation because they don't know the answer (which is often the case), then one needs to answer another question,
"How long do I run the simulation to be confident I have the right answer?"
The answer to that seems simple:
"Run it for a long time."
Eventually your estimated outputs could converge to a single value such that outputs from additional simulations don't drastically change from prior runs. The problem here is that in some cases, depending on the complexity of the system you're simulating, seemingly convergent output may be a temporary phenomenon. That is, if you ran a hundred thousand more simulations, you might begin to see your outputs diverge from what you thought was your stable answer. In a different scenario, despite having run tens of millions of simulations, it could happen that an output still hasn't converged. Do you have the time to program and run the simulation? Or would a mathematical approximation get you there sooner?
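For a simple yes/no experiment like the flush simulation, a rough feel for "how long" comes from the standard error of a proportion, sqrt(p(1-p)/n). The sketch below assumes every hand is an independent trial, which is only approximately true when several hands are dealt from the same deck; relative_standard_error is a helper name made up for this illustration.

```python
from math import sqrt

def relative_standard_error(p, n):
    """Standard error of a simulated proportion after n trials, relative to p itself."""
    return sqrt(p * (1 - p) / n) / p

p = 5108 / 2598960  # the mathematically derived flush probability
for n in (1_000, 80_000, 1_000_000):
    print(f"{n:>9} hands: typical error of about {relative_standard_error(p, n):.1%}")
```

At 80,000 hands the typical relative error comes out at roughly 8%, which is in line with the simulation above still wobbling a few percent around the derived value. Halving the error takes about four times as many hands.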
There is yet another concern:
"What is the cost?"
Consumer computers are relatively cheap today but 30 years ago they cost $4,000 to $9,000 in 2019 dollars. In comparison, a TI89 only cost $215 (again, in 2019 dollars). So if you were asking this question back in 1990 and you were good with probability math, you could have saved $3,800 by using a TI89. Cost is just as important today: simulating self-driving cars and protein folding can burn many millions of dollars.
Finally, mission critical applications may require both a simulation and a mathematical model to cross check the results of both approaches. A tidy example of this is when Matt Parker of StandUpMaths calculated the odds of landing on any property in the game of Monopoly by simulation and confirmed those results with Hannah Fry's mathematical model of the same game.

I think this should work for you. It depends on how your Deck() is defined though, I guess. I tried to leave your code in a similar state to how you had written it, but had to make some changes so you wouldn't get errors. I also didn't actually run it, since I don't have Deck defined.
flush = False
count = 0
while not flush:
    deck = Deck()
    stop = False
    while len(deck.cards) > 5:
        hand = deck.deal(5)
        # (Card(suit='♣', rank='7'), Card(suit='♠', rank='2'), Card(suit='♥', rank='4'), Card(suit='♥', rank='K'), Card(suit='♣', rank='3'))
        if len(set(card.suit for card in hand)) > 1:
            print(f"No Flush: {hand}")
        else:
            print(f"Yay, it's a Flush: {hand}")
            flush = True
            break
        count +=1
print(f'Count is {count}')
But it will not give you the probability of getting a flush, and you'll honestly probably run out of cards in deck before you get a flush in almost every run...
I would consider changing the code to this, to take out some redundancies:
flush = False
count = 0
while not flush:
    deck = Deck()
    while len(deck.cards) > 5:
        hand = deck.deal(5)
        # (Card(suit='♣', rank='7'), Card(suit='♠', rank='2'), Card(suit='♥', rank='4'), Card(suit='♥', rank='K'), Card(suit='♣', rank='3'))
        if len(set(card.suit for card in hand)) == 1:
            print(f"Yay, it's a Flush: {hand}")
            flush = True
            break
        else:
            print(f"No Flush: {hand}")
            count +=1
print(f'Count is {count}')

What is the easiest way "pair" strings in one list with integers in another (in python)?

I'm working on a school project which has to store the names of people and their respective score on a test in a list so that I can manipulate it to find averages as well as printing out each persons score with their name. Relatively new to Python so any help is appreciated :)

I would recommend using a dictionary. This pairs keys (the name of students) to values (the score on a test). Here is an example below that gets you the output that you would want.
import math
student_scores = {}
student_scores['Rick'] = 89
student_scores['Pat'] = 79
student_scores['Larry'] = 82
score_list = []
for name, score in student_scores.items():
    score_list.append(score)
    print(name.title() + "'s score was: " + str(score) + '%')
sum_scores = sum(score_list)
division_scores = len(score_list)
average_score = sum_scores / division_scores
print('The average score was {0:.2f}%'.format(average_score))
I created an empty dictionary to which you add student names and scores. So in the dictionary (student_scores), the student name 'Rick' will be a key, and the score 89 will be the value. I do this for 2 additional students, pairing their name up with the score that they received.
I create an empty list called score_list. You'll use this list later to compute the sum of all scores and divide it by the number of scores to get an average score for your test.
We start a for loop that iterates over each key and value in your dictionary. For each score, we append it to the empty score list. For each name and score, we print a message showing what the student got on the test.
Now that we have appended the scores to the list, we can use the built-in sum() function to get the sum of all scores in your score list. We put it in a variable called sum_scores. We also get the number of scores in your list by finding the length of the list (which will be 3 in this case since I put 3 scores in it). We will store that in a variable called division_scores (since I am dividing the sum of all scores by the number of scores recorded). We create a variable called average_score which is the result of the sum of scores divided by the total number of observations.
We then print what the average score was using the .format() method. The format spec {0:.2f} shows the average score to two decimal places, and the % after it is printed literally.
Your output is as follows:
Rick's score was: 89%
Pat's score was: 79%
Larry's score was: 82%
The average score was 83.33%
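If the names and scores really do start out in two separate lists, as the question title suggests, zip() can pair them up before building the dictionary. A minimal sketch, assuming both lists have the same length and are in the same order:

```python
names = ["Rick", "Pat", "Larry"]
scores = [89, 79, 82]

# zip() pairs the i-th name with the i-th score; dict() turns the pairs into a mapping.
student_scores = dict(zip(names, scores))

for name, score in student_scores.items():
    print(f"{name}'s score was: {score}%")

average_score = sum(student_scores.values()) / len(student_scores)
print(f"The average score was {average_score:.2f}%")
```

Summing student_scores.values() directly also removes the need for the intermediate score_list.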

The above answer is a great data structure for pairing strings. It'll set you on the right track for enumerating scores, averages, etc in simple cases.
Another way to store relationships is with classes (or tuples, at the bottom!) There's a rough sketch of an OOP approach below.
The most important parts are:
The properties of the ExamAttempt class store the information (names, scores).
In the Exam.record_attempt method, a new ExamAttempt object is created from the ExamAttempt class and added to the list of attempts on the Exam object.
From here, you could easily add other features. You'd probably want to model a Question and Answer, and maybe a Student object too, if you're going all out. If you store questions and answers, as well as which answer each student selected, you can start doing things like throwing out questions, grading on a curve, discovering questions to throw out, etc. The OOP approach makes it easier to extend functionality like plotting all kinds of fancy graphs, export to CSV or Excel, and so on.
Not all of the code below is necessary.. it can definitely be simplified a little, or reimagined entirely, but hopefully this should give you enough to start looking down that path. Even if it seems complicated now, it's not that bad, and it's what you'll want to be doing eventually (with Python, anyway!)
class ExamAttempt:
    def __init__(self, id, name, correct, total):
        self.id = id
        self.name = name
        self.correct = correct
        self.total = total
        self.score = (self.correct / float(self.total))

    def __repr__(self):
        return "<ExamAttempt: Id={}, Student={}, Score={}>".format(self.id, self.name, self.score)

class Exam:
    def __init__(self, name, questions):
        self.name = name
        self.attempts = []
        self.questions = questions
        self.num_questions = len(questions)

    def __str__(self):
        return "<Exam ({})>".format(self.name)

    def load(self, filename):
        pass

    def saveAttemptsToFile(self, filename):
        pass

    def record_attempt(self, student_name, num_correct):
        id = len(self.attempts) + 1
        self.attempts.append(
            ExamAttempt(id, student_name, num_correct, self.num_questions))

    def get_student_attempt(self, student_name):
        for att in self.attempts:
            if student_name == att.name:
                return att

    def get_average_score(self):
        return "homework"

    def get_results_by_score(self):
        return sorted(self.attempts, key=lambda x: x.score, reverse=True)

    def get_attempts_by_name(self):
        return sorted(self.attempts, key=lambda x: x.name)

if __name__ == '__main__':
    questions = ['Question?' for i in range(100)] # Generate 100 "questions" = 100%
    exam = Exam('Programming 101', questions)
    data = [('Rick', 89), ('Pat', 79), ('Larry', 82)]
    for name, correct in data:
        exam.record_attempt(name, correct)
    for attempt in exam.get_results_by_score():
        print("{} scored {}".format(attempt.name, attempt.score))

Python function to get lowest and average score

I am a newbie to Python programming. I am working on a class homework and have the code below so far. The next step that I am struggling with is to write a function that would show / print the lowest score and average score. Any direction would be much appreciated.
scores = """Aturing:Mark$86:
Inewton:Mark$67.5:
Cdarwin:Mark$90:
Fnightingale:Mark$99:
Cvraman:Mark$10:"""
students = {}
for studentdata in scores.split('\n'):
    data = studentdata.split(':')
    name = data[0]
    students[name] = {}
    for class_data in data[1:]:
        if class_data:
            Mark,class_score = class_data.split('$')
            students[name][Mark] = class_score

def Grade_Show(student,Mark):
    if student in students:
        if Mark in students[student]:
            print("Student %s got %s in the assignment %s" % (student,students[student][Mark],Mark))
        else:
            print("subject %s not found for student %s" % (Mark,student))
    else:
        print("student %s not found" % (student))

#do some testing
Grade_Show("Inewton","Mark")

Testing with: scores = {'alex': 1, 'dave': 1, 'mike': 2};
Firstly, to find the lowest score, use the min() function.
So:
min_keys = [k for k, x in scores.items() if not any(y < x for y in scores.values())]
print('Lowest score:', str(min(scores.values())) + '.', 'Achieved by: ')
for student in min_keys:
    print(student)
Output:
Lowest score: 1. Achieved by:
alex
dave
Secondly, assuming you are looking for the mean average, you would do this:
print('The average score was:', str(sum(scores.values()) / len(scores)))
Output:
The average score was: 1.3333333333333333
Hope I helped! All you need to do now is create a function containing that code, with a parameter called data. That way you can have multiple dictionaries to represent different classes or tests. You would replace all instances of scores in the code with data.
Also, the 'minimum score' code could be easily modified to give the maximum score. Finally, depending on the size of your program you could store the output in a variable rather than using a print statement so you can recall it later. This would also mean that you should return the result, not print it.
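Put together, such a function might look like the sketch below. It is written against the nested students dictionary from the question (name -> {"Mark": "86"}), where the marks are still strings, so they are converted with float(). The function name lowest_and_average and its assignment parameter are assumptions made for this example.

```python
def lowest_and_average(students, assignment="Mark"):
    """Return (lowest mark, names who got it, average mark) for one assignment."""
    marks = {name: float(record[assignment])
             for name, record in students.items()
             if assignment in record}
    lowest = min(marks.values())
    lowest_names = [name for name, mark in marks.items() if mark == lowest]
    average = sum(marks.values()) / len(marks)
    return lowest, lowest_names, average


# Same shape as the dictionary the question's parsing loop builds.
students = {
    "Aturing": {"Mark": "86"},
    "Inewton": {"Mark": "67.5"},
    "Cdarwin": {"Mark": "90"},
    "Fnightingale": {"Mark": "99"},
    "Cvraman": {"Mark": "10"},
}
lowest, names, average = lowest_and_average(students)
print("Lowest score:", lowest, "achieved by", ", ".join(names))
print("Average score:", average)
```

Returning the values instead of printing them inside the function keeps them available for later use, as suggested above.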

"The next step that I am struggling with is to write a function that would show / print the lowest score and average score."
Step 1:
Can you iterate through your data structure (students) and print only the scores? If you can do that, then you should be able to run through and find the lowest score.
To find the lowest score, start with some imagined maximum possible value (set some variable equal to 100, for example, if that's the highest possible) and iterate through all the scores (for score in score..., etc.), testing to see if each value you get is lower than the variable you created.
If it is lower, make the variable you created equal to that lower value. After that, it will continue iterating to see if any new value is less than this new 'lowest' value. By the time it reaches the end, it should have provided you with the lowest value.
One tricky part is making sure to print both the name and lowest value, if that's what the question requires.
Step 2:
To solve the average problem, you'll do something similar: iterate over the scores, add them to a new data structure (or a running total), and then figure out how to take an average of them.
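The two steps can be sketched with a single loop, as below. To stay short, the example uses a flat name -> score dictionary with numeric marks rather than the question's nested structure, and it assumes 100 is the highest possible mark, as suggested in Step 1.

```python
scores = {"Aturing": 86, "Inewton": 67.5, "Cdarwin": 90, "Fnightingale": 99, "Cvraman": 10}

lowest_score = 100   # assumed maximum possible mark; float("inf") also works
lowest_name = None
total = 0

for name, score in scores.items():
    # Step 1: remember the lowest value seen so far, and who got it.
    if score < lowest_score:
        lowest_score = score
        lowest_name = name
    # Step 2: accumulate a running total for the average.
    total += score

average = total / len(scores)
print("Lowest:", lowest_name, lowest_score)
print("Average:", average)
```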

Summing Only First Values When in Dictionary When Keys Correspond To Lists (Python)

I'm working in python 2.7. I have a list of teams in the following dictionary:
NL = {'Phillies': [662, 476], 'Braves': [610, 550], 'Mets': [656, 687]}
The first value in the list is the number of runs the team has scored, and the second is the number of runs that team has given up.
I'm using this code to determine the Pythagorean winning percentage of each team, but I would also like to be able to have the function calculate the total number of runs scored and allowed by the group as a whole.
Right now I'm using:
def Pythag(league):
    for team, scores in league.iteritems():
        runs_scored = float(scores[0])
        runs_allowed = float(scores[1])
        win_percentage = (runs_scored**2)/((runs_scored**2)+(runs_allowed**2))
        total_runs_scored = sum(scores[0] for team in league)
        print '%s: %f' % (team, win_percentage)
        print '%s: %f' % ('League Total:', total_runs_scored)
I'm not sure exactly what is going on with the sum function, but instead of getting one value, I'm getting a different value over each iteration of the team and win_percentage, and it's not the same value...
Ideally, the function would just return one value for the sum of the runs scored for each team in the dictionary.
Thanks for any help.

If you want to have the running total available, or don't want to iterate over league twice, you can do:
def Pythag(league):
    total_runs_scored = 0
    for team, scores in league.iteritems():
        # other stuff
        total_runs_scored += scores[0]
        # other stuff
        # runs scored by all teams up to this point
        print 'League Running Total of Runs Scored: %f' % (total_runs_scored,)
    # outside the loop, so total runs scored in the league.
    # will be the same as the last one in the loop
    print 'League Total Runs Scored: %f' % (total_runs_scored,)
Remember that inside the loop you're talking about a single team, so you don't need to do a sum to get the runs scored by that team, you instead need to add it to the runs scored by all the previous teams, that is, the scores[0] from the previous iterations of the loop.
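This also explains the changing values in the question: sum(scores[0] for team in league) adds the current team's scores[0] once per team in the league, so it yields three times that team's runs and differs on every iteration. If iterating over the league twice is acceptable, the totals can be computed outside the loop. The sketch below is written for Python 3 (items() instead of iteritems(), print as a function), and the lowercase name pythag and the returned tuple are choices made for this example.

```python
def pythag(league):
    """Return ({team: Pythagorean win pct}, total runs scored, total runs allowed)."""
    win_pct = {}
    for team, (runs_scored, runs_allowed) in league.items():
        win_pct[team] = runs_scored**2 / (runs_scored**2 + runs_allowed**2)
    # League-wide totals: sum over every team's list, not over one team's value.
    total_scored = sum(scores[0] for scores in league.values())
    total_allowed = sum(scores[1] for scores in league.values())
    return win_pct, total_scored, total_allowed


NL = {'Phillies': [662, 476], 'Braves': [610, 550], 'Mets': [656, 687]}
win_pct, total_scored, total_allowed = pythag(NL)
for team, pct in win_pct.items():
    print('%s: %f' % (team, pct))
print('League total runs scored: %d, allowed: %d' % (total_scored, total_allowed))
```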
