Replacing items in a list with items from another list - python

I'm trying to create a calculator for a user's weighted GPA. I'm using PyAutoGUI to ask the user for their grades and the type of each class they're taking, and I want to take that input and remap it to a different value.
class GPA():
    grades = []
    classtypes = []
    your_format = confirm(text='Choose your grade format: ', title='',
                          buttons=['LETTERS', 'PERCENTAGE', 'QUIT'])
    classnum = int(prompt("Enter the number of classes you have: "))
    for i in range(classnum):
        grade = prompt(text='Enter your grade for the course :'.format(name)).lower()
        classtype = prompt(text='Enter the type of Course (Ex. Regular, AP, Honors): ').lower()
        classtypes.append(classtype)
        grades.append(grade)
    def __init__(self):
        self.gradeMap = {'a+': 4.0, 'a': 4.0, 'a-': 3.7, 'b+': 3.3, 'b': 3.0,'b-': 2.7,
                         'c+': 2.3, 'c': 2.0, 'c-': 1.7, 'd+': 1.3, 'd': 1.0, 'f': 0.0}
        self.weightMap = {'advanced placement': 1.0, 'ap': 1.0, 'honors': 0.5,'regular': 0.0}

Based on the gradeMap dictionary you have defined you could do something with what's called a list comprehension.
An example of what I'm talking about done using the Python interpreter:
>>> grades = ['a', 'c-', 'c']
>>> gradeMap = {'a+': 4.0, 'a': 4.0, 'a-': 3.7, 'b+': 3.3, 'b': 3.0,'b-': 2.7,
... 'c+': 2.3, 'c': 2.0, 'c-': 1.7, 'd+': 1.3, 'd': 1.0, 'f': 0.0}
>>> [gradeMap[grade] for grade in grades] #here's the list comprehension
[4.0, 1.7, 2.0]
The downside of this approach is that the user must enter only grades defined in gradeMap; any other input raises a KeyError.
Another alternative is to use map. map is slightly different in that it expects a function and an input list, and then applies that function to each item of the input list.
An example with a very simple function that only works with a few grades:
>>> def convert_grade_to_points(grade):
...     if grade == 'a':
...         return 4.0
...     elif grade == 'b':
...         return 3.0
...     else:
...         return 0
...
>>> grades = ['a', 'b', 'b']
>>> list(map(convert_grade_to_points, grades))  # in Python 3, map returns an iterator
[4.0, 3.0, 3.0]
This has the same downside mentioned earlier: the function you define has to handle the case where the user enters an invalid grade.
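Both downsides can be softened by looking grades up with dict.get, which returns None for unknown keys instead of raising. The following is a minimal sketch, not the asker's code: the function names weighted_points and weighted_gpa are hypothetical, and it assumes the weight is a bonus added to the base grade points, which the question's weightMap values suggest but do not state.

```python
GRADE_MAP = {'a+': 4.0, 'a': 4.0, 'a-': 3.7, 'b+': 3.3, 'b': 3.0, 'b-': 2.7,
             'c+': 2.3, 'c': 2.0, 'c-': 1.7, 'd+': 1.3, 'd': 1.0, 'f': 0.0}
WEIGHT_MAP = {'advanced placement': 1.0, 'ap': 1.0, 'honors': 0.5, 'regular': 0.0}


def weighted_points(grades, classtypes):
    """Return one weighted point value per course, skipping unknown input."""
    points = []
    for grade, classtype in zip(grades, classtypes):
        base = GRADE_MAP.get(grade.strip().lower())
        bonus = WEIGHT_MAP.get(classtype.strip().lower())
        if base is None or bonus is None:
            # Unknown grade or class type: skip it rather than raise KeyError.
            continue
        # Assumption: the weight is simply added to the base points.
        points.append(base + bonus)
    return points


def weighted_gpa(grades, classtypes):
    """Average the weighted points, or return None if nothing was valid."""
    points = weighted_points(grades, classtypes)
    return sum(points) / len(points) if points else None


print(weighted_gpa(['A', 'b', 'c', 'zz'], ['AP', 'Honors', 'regular', 'regular']))
```

Skipping invalid entries is only one choice; re-prompting the user until the input is valid may fit a GUI flow better.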

You can replace items of the list in place by assigning through the index:
for i, grade in enumerate(gradeList):
    if your_format == "PERCENTAGE":
        gradeList[i] = grade * some_factor  # use your logic
    elif your_format == "LETTERS":
        gradeList[i] = "some other logic"

Related

Improve the performance of code that uses for-loops

I am trying to create a list based on some data, but the code I am using is very slow when I run it on large data. So I suspect I am not using all of the Python power for this task. Is there a more efficient and faster way of doing this in Python?
Here is an explanation of the code:
You can think of this problem as a list of games (list_type), each with a list of participating teams, and the scores for each team in the game (list_xx). For each team in the current game, the code first calculates the sum of the score differences from previous games (win_comp_past_difs), including only the pairs in the current game. It then updates each pair in the current game with the difference in scores. A defaultdict keeps track of the score differences for each pair and is updated as each game is played.
In the example below, based on some data, there are for-loops used to create a new variable list_zz.
The data and the for-loop code:
import pandas as pd
import numpy as np
from collections import defaultdict
from itertools import permutations
list_type = [['A', 'B'], ['B'], ['A', 'B', 'C', 'D', 'E'], ['B'], ['A', 'B', 'C'], ['A'], ['B', 'C'], ['A', 'B'], ['C', 'A', 'B'], ['A'], ['B', 'C']]
list_xx = [[1.0, 5.0], [3.0], [2.0, 7.0, 3.0, 1.0, 6.0], [3.0], [5.0, 2.0, 3.0], [1.0], [9.0, 3.0], [2.0, 7.0], [3.0, 6.0, 8.0], [2.0], [7.0, 9.0]]
list_zz= []
#for-loop
wd = defaultdict(float)
for i, x in zip(list_type, list_xx):
    # staff 1
    if len(i) == 1:
        #print('NaN')
        list_zz.append(np.nan)
        continue
    # Pairs and difference generator for current game (i)
    pairs = list(permutations(i, 2))
    dgen = (value[0] - value[1] for value in permutations(x, 2))
    # Sum of differences from previous games including only pairs of teams in the current game
    for team, result in zip(i, x):
        win_comp_past_difs = sum(wd[key] for key in pairs if key[0] == team)
        #print(win_comp_past_difs)
        list_zz.append(win_comp_past_difs)
    # Update pair differences for current game
    for pair, diff in zip(pairs, dgen):
        wd[pair] += diff
print(list_zz)
Which looks like this:
[0.0, 0.0, nan, -4.0, 4.0, 0.0, 0.0, 0.0, nan, -10.0, 13.0, -3.0, nan, 3.0, -3.0, -6.0, 6.0, -10.0, -10.0, 20.0, nan, 14.0, -14.0]
If you could elaborate on the code to make it more efficient and execute faster, I would really appreciate it.
Without reviewing the overall design of your code, one improvement pops out at me: move your code to a function.
As currently written, all of the variables you use are global variables. Because the global namespace is dynamic, Python must look up each global variable every time you access it.(1) In CPython, this corresponds to a hash table lookup, which can be expensive, particularly if hash collisions are present.
In contrast, local variables can be known at compile time, and so are stored in a fixed-size array. Accessing these variables therefore only involves dereferencing a pointer, which is comparatively much faster.
With this principle in mind, you should be able to boost your performance (somewhere around a 40% drop in run time) by moving all of your code into a "main" function:
def main():
    ...
    # Your code here
if __name__ == '__main__':
    main()
(1) Source
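To check the effect on your own machine, a rough timing sketch along these lines may help. The function names and the loop are made up for illustration; the global statement is used only to force the loop variables through the module namespace, mimicking code written at module level. The measured difference will vary with the Python version and workload, so no expected numbers are shown.

```python
import timeit

N = 100_000


def loop_with_globals():
    # 'global' forces every read and write of these names through the
    # module namespace, as happens for code written at module level.
    global g_total, g_i
    g_total = 0
    for g_i in range(N):
        g_total += g_i
    return g_total


def loop_with_locals():
    # Same loop, but total and i are ordinary local variables.
    total = 0
    for i in range(N):
        total += i
    return total


if __name__ == '__main__':
    for fn in (loop_with_globals, loop_with_locals):
        seconds = timeit.timeit(fn, number=20)
        print(fn.__name__, round(seconds, 3))
```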

How to get a value in a tuple in a dictionary?

I want to access the values in a tuple within a dictionary using a lambda function.
I need to get the average GPA for each subject by averaging the grades of the students in that class.
I have tried using a lambda, but I could not figure it out.
grade = {'A': 4.0, 'B': 3.0, 'C': 2.0, 'D': 1.0, 'F' : 0.0}
subjects = {'math': {('Jack', 'A'),('Larry', 'C')}, 'English': {('Kevin', 'C'),('Tom','B')}}
def highestAverageOfSubjects(subjects):
    return
The output needs to be ['math', 'English'], since the average GPA of math (3.0) is greater than the average GPA of English (2.5).
You can easily sort everything by using sorted with a key function:
Grade = {'A': 4.0, 'B': 3.0, 'C': 2.0, 'D': 1.0, 'F' : 0.0}
subject = {'math': {('Jack', 'A'),('Larry', 'C')}, 'English': {('Kevin', 'C'),('Tom','B')}}
result = sorted(subject, key=lambda x: sum(Grade[g] for _, g in subject[x]) / len(subject[x]), reverse=True)
print(result)
Output:
['math', 'English']
If you also want to sort by the number of students as a secondary key:
result = sorted(subject, key=lambda x: (sum(Grade[g] for _, g in subject[x]) / len(subject[x]), len(subject[x])), reverse=True)
print(result)
One issue with your implementation is that you have used a set as the values in your subject dict. This means you have to iterate over each element. Once you have an element, the grade is simply elem[1].
For example:
Grade = {'A': 4.0, 'B': 3.0, 'C': 2.0, 'D': 1.0, 'F' : 0.0}
subject = {'math': {('Jack', 'A'),('Larry', 'C')}, 'English': {('Kevin', 'C'),('Tom','B')}}
for elem in subject['math']:
    print(elem[1])
Output (sets are unordered, so the order may vary):
C
A
If in the loop above you just print(elem), then you'd see something like:
('Larry', 'C')
('Jack', 'A')
In this way you could extend your highestAverageOfSubjects(subjects) implementation to get what you want.
To find the average grade of a subject:
def highAveSub(subname):
    total = 0
    for elem in subject[subname]:  # your values are sets of tuples, not dicts
        # Cross-reference the numerical value of the letter grade.
        # You could also use enums; I'll leave that to you to find out.
        total = total + Grade[elem[1]]
    avg = total / len(subject[subname])
    return avg
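Putting the two answers together, one possible way to fill in the function from the question is sketched below. It keeps the question's names grade, subjects and highestAverageOfSubjects; the helper average_gpa is a hypothetical name introduced here.

```python
grade = {'A': 4.0, 'B': 3.0, 'C': 2.0, 'D': 1.0, 'F': 0.0}
subjects = {'math': {('Jack', 'A'), ('Larry', 'C')},
            'English': {('Kevin', 'C'), ('Tom', 'B')}}


def average_gpa(students):
    """Average grade points for a set of (name, letter) tuples."""
    return sum(grade[letter] for _name, letter in students) / len(students)


def highestAverageOfSubjects(subjects):
    """Return subject names sorted from highest to lowest average GPA."""
    return sorted(subjects,
                  key=lambda name: average_gpa(subjects[name]),
                  reverse=True)


print(highestAverageOfSubjects(subjects))
```

Splitting out average_gpa keeps the lambda short and lets the averaging be tested on its own.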

Separating nested for loops in list comprehensions

Starting from this dataframe
import pandas as pd
df2 = pd.DataFrame({'t': ['a', 'a', 'a', 'b', 'b', 'b'],
                    'x': [1.1, 2.2, 3.3, 1.1, 2.2, 3.3],
                    'y': [1.0, 2.0, 3.0, 2.0, 3.0, 4.0]})
it's possible to simplify these nested for loops:
for t, df in df2.groupby('t'):
    print("t:", t)
    for d in df.to_dict(orient='records'):
        print({'x': d['x'], 'y': d['y']})
by separating the inner loop into a function:
def handle(df):
    for d in df.to_dict(orient='records'):
        print({'x': d['x'], 'y': d['y']})
for t, df in df2.groupby('t'):
    print("t:", t)
    handle(df)
How might I similarly separate a nested list comprehension:
mydict = {
    t: [{'x': d['x'], 'y': d['y']} for d in df.to_dict(orient='records')]
    for t, df in df2.groupby(['t'])
}
into two separate loops?
I'm asking the question with just two levels of nesting, where the need is hardly critical. The motivations are:
By the time there are a few levels, the code becomes tough to read.
Developing and testing smaller blocks guards against (present and future) mistakes at more than the outer level.
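One way this separation might look is to give each level of the comprehension its own small function, mirroring the handle(df) refactoring above. The sketch below is an illustration under assumptions: the names to_points and group_points are hypothetical, and a plain list of (key, records) pairs stands in for df2.groupby('t') so the example runs without pandas.

```python
def to_points(records):
    """Inner level: keep only the 'x' and 'y' fields of each record."""
    return [{'x': r['x'], 'y': r['y']} for r in records]


def group_points(groups):
    """Outer level: groups is an iterable of (key, records) pairs."""
    return {key: to_points(records) for key, records in groups}


# Plain-Python stand-in for the grouped DataFrame. With pandas, the pairs
# could instead be produced by something like:
#   ((t, df.to_dict(orient='records')) for t, df in df2.groupby('t'))
groups = [
    ('a', [{'t': 'a', 'x': 1.1, 'y': 1.0}, {'t': 'a', 'x': 2.2, 'y': 2.0}]),
    ('b', [{'t': 'b', 'x': 1.1, 'y': 2.0}]),
]

mydict = group_points(groups)
print(mydict)
```

Each function can then be developed and tested on its own, which is the second motivation listed above.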

python position frequency dictionary of letters in words

To efficiently get the frequencies of letters (given the alphabet ABC) in a string code, I can write a function like this (Python 3):
def freq(code):
    return {n: code.count(n)/float(len(code)) for n in 'ABC'}
Then
code='ABBBC'
freq(code)
Gives me
{'A': 0.2, 'C': 0.2, 'B': 0.6}
But how can I get the frequencies for each position along a list of strings of unequal lengths ? For instance mcode=['AAB', 'AA', 'ABC', ''] should give me a nested structure like a list of dict (where each dict is the frequency per position):
[{'A': 1.0, 'C': 0.0, 'B': 0.0},
{'A': 0.66, 'C': 0.0, 'B': 0.33},
{'A': 0.0, 'C': 0.5, 'B': 0.5}]
I cannot figure out how to compute the frequencies per position across all strings and wrap this in a list comprehension. Inspired by other SO posts on word counts, e.g. the well-discussed post "Python: count frequency of words in a list", I thought the Counter class from the collections module might help.
Understand it like this - write the mcode strings on separate lines:
AAB
AA
ABC
Then what I need is the column-wise frequencies (AAA, AAB, BC) of the alphabet ABC in a list of dict where each list element is the frequencies of ABC per columns.
A much shorter solution:
from itertools import zip_longest
def freq(code):
    l = len(code) - code.count(None)
    return {n: code.count(n)/l for n in 'ABC'}
mcode=['AAB', 'AA', 'ABC', '']
results = [ freq(code) for code in zip_longest(*mcode) ]
print(results)
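The key step is zip_longest(*mcode), which transposes the strings into per-position tuples and pads the shorter strings with None. That is why freq subtracts code.count(None) from the length. A small sketch showing the intermediate columns (the variable name columns is made up here):

```python
from itertools import zip_longest

mcode = ['AAB', 'AA', 'ABC', '']

# One tuple per position; strings that are too short contribute None.
columns = list(zip_longest(*mcode))
for position, column in enumerate(columns):
    present = [c for c in column if c is not None]
    print(position, column, len(present))
```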
An example; the steps are briefly explained in the comments. Counter from the collections module is not used, because the mapping for each position must also contain characters that are not present at that position, and the order of the frequencies does not seem to matter.
def freq(*words):
    # All dictionaries contain all characters as keys, even
    # if a character is not present at a position.
    # Create a sorted list of characters in chars.
    chars = set()
    for word in words:
        chars |= set(word)
    chars = sorted(chars)
    # Get the number of positions.
    max_position = max(len(word) for word in words)
    # Initialize the result list of dictionaries.
    result = [
        dict((char, 0) for char in chars)
        for position in range(max_position)
    ]
    # Count characters.
    for word in words:
        for position in range(len(word)):
            result[position][word[position]] += 1
    # Change to frequencies
    for position in range(max_position):
        count = sum(result[position].values())
        for char in chars:
            result[position][char] /= count  # float(count) for Python 2
    return result
# Testing
from pprint import pprint
mcode = ['AAB', 'AA', 'ABC', '']
pprint(freq(*mcode))
Result (Python 3):
[{'A': 1.0, 'B': 0.0, 'C': 0.0},
{'A': 0.6666666666666666, 'B': 0.3333333333333333, 'C': 0.0},
{'A': 0.0, 'B': 0.5, 'C': 0.5}]
In CPython 3.6, dictionaries preserve insertion order (guaranteed by the language from Python 3.7), so the keys come out sorted here; earlier versions can use OrderedDict from collections instead of dict.
Your code isn't efficient at all:
You first need to define which letters you'd like to count.
You need to scan the string once for each distinct letter.
You could just use Counter:
import itertools
from collections import Counter
mcode=['AAB', 'AA', 'ABC', '']
all_letters = set(''.join(mcode))
def freq(code):
    code = [letter for letter in code if letter is not None]
    n = len(code)
    counter = Counter(code)
    return {letter: counter[letter]/n for letter in all_letters}
print([freq(x) for x in itertools.zip_longest(*mcode)])
# [{'A': 1.0, 'C': 0.0, 'B': 0.0}, {'A': 0.6666666666666666, 'C': 0.0, 'B': 0.3333333333333333}, {'A': 0.0, 'C': 0.5, 'B': 0.5}]
For Python2, you could use itertools.izip_longest.

python code works on one file but fails on other

Hi all. I have this code, which prints the minimum cost and the restaurant id for the requested item or items. The customer doesn't want to visit multiple restaurants. For example, if he asks for "A,B", the code should print a shop that offers them both, instead of scattering the order across different restaurants (even if some restaurant offers one of the items more cheaply).
Also, suppose the user asks for a burger. If restaurant 'X' offers a "burger" for $4, whereas restaurant 'Y' offers "burger+tuna+tofu" for $3, then we tell the user to go to restaurant 'Y', even though it includes extra items apart from the burger the user asked for; we are happy to give them extra items as long as it's cheaper.
Everything is fine, except that the code behaves differently on two input files of the same format (it fails on input.csv but runs on input-2.csv): it gives the correct output for one and fails for the other. This is the only error I need your help to fix. Please help; I have hit a wall and can't think past it.
def build_shops(shop_text):
    shops = {}
    for item_info in shop_text:
        shop_id,cost,items = item_info.replace('\n', '').split(',')
        cost = float(cost)
        items = items.split('+')
        if shop_id not in shops:
            shops[shop_id] = {}
        shop_dict = shops[shop_id]
        for item in items:
            if item not in shop_dict:
                shop_dict[item] = []
            shop_dict[item].append([cost,items])
    return shops
def solve_one_shop(shop, items):
    if len(items) == 0:
        return [0.0, []]
    all_possible = []
    first_item = items[0]
    if first_item in shop:
        print "SHOP",shop.get(first_item)
    for (price,combo) in shop[first_item]:
        #print "items,combo=",items,combo
        sub_set = [x for x in items if x not in combo]
        #print "sub_set=",sub_set
        price_sub_set,solution = solve_one_shop(shop, sub_set)
        solution.append([price,combo])
        all_possible.append([price+price_sub_set, solution])
    cheapest = min(all_possible, key=(lambda x: x[0]))
    return cheapest
def solver(input_data, required_items):
    shops = build_shops(input_data)
    #print shops
    result_all_shops = []
    for shop_id,shop_info in shops.iteritems():
        (price, solution) = solve_one_shop(shop_info, required_items)
        result_all_shops.append([shop_id, price, solution])
    shop_id,total_price,solution = min(result_all_shops, key=(lambda x: x[1]))
    print('SHOP_ID=%s' % shop_id)
    sln_str = [','.join(items)+'(%0.2f)'%price for (price,items) in solution]
    sln_str = '+'.join(sln_str)
    print(sln_str + ' = %0.2f' % total_price)
shop_text = open('input-1.csv','rb')
solver(shop_text,['burger'])
=====input-1.csv===== (columns: restaurant_id, price, item)
1,2.00,burger
1,1.25,tofulog
1,2.00,tofulog
1,1.00,chef_salad
1,1.00,A+B
1,1.50,A+CCC
1,2.50,A
2,3.00,A
2,1.00,B
2,1.20,CCC
2,1.25,D
=====output & error====:
{'1': {'A': [[1.0, ['A', 'B']], [1.5, ['A', 'CCC']], [2.5, ['A', 'D']]], 'B': [[1.0, ['A', 'B']]], 'D': [[2.5, ['A', 'D']]], 'chef_salad': [[1.0, ['chef_salad']]], 'burger': [[2.0, ['burger']]], 'tofulog': [[1.25, ['tofulog']], [2.0, ['tofulog']]], 'CCC': [[1.5, ['A', 'CCC']]]}, '2': {'A': [[3.0, ['A']]], 'B': [[1.0, ['B']]], 'D': [[1.25, ['D']]], 'CCC': [[1.2, ['CCC']]]}}
SHOP [[2.0, ['burger']]]
Traceback (most recent call last):
File "work.py", line 55, in <module>
solver(shop_text,['burger'])
File "work.py", line 43, in solver
(price, solution) = solve_one_shop(shop_info, required_items)
File "work.py", line 26, in solve_one_shop
for (price,combo) in shop[first_item]:
KeyError: 'burger'
Whereas if I run the same code on input-2.csv and query solver(shop_text,['A','CCC']), I get the correct result:
=====input-2.csv======
1,2.00,A
1,1.25,B
1,2.00,B
1,1.00,A
1,1.00,A+B
1,1.50,A+CCC
1,2.50,A+D
2,3.00,A
2,1.00,B
2,1.20,CCC
2,1.25,D
=========output====
{'1': {'A': [[2.0, ['A']], [1.0, ['A']], [1.0, ['A', 'B']], [1.5, ['A', 'CCC']], [2.5, ['A', 'D']]], 'B': [[1.25, ['B']], [2.0, ['B']], [1.0, ['A', 'B']]], 'D': [[2.5, ['A', 'D']]], 'CCC': [[1.5, ['A', 'CCC']]]}, '2': {'A': [[3.0, ['A']]], 'B': [[1.0, ['B']]], 'D': [[1.25, ['D']]], 'CCC': [[1.2, ['CCC']]]}}
SHOP [[2.0, ['A']], [1.0, ['A']], [1.0, ['A', 'B']], [1.5, ['A', 'CCC']], [2.5, ['A', 'D']]]
SHOP [[1.5, ['A', 'CCC']]]
SHOP [[1.5, ['A', 'CCC']]]
SHOP [[1.5, ['A', 'CCC']]]
SHOP [[1.5, ['A', 'CCC']]]
SHOP [[3.0, ['A']]]
SHOP [[1.2, ['CCC']]]
SHOP_ID=1
A,CCC(1.50) = 1.50
You can find the error as follows:
In your solve_one_shop method, print the dictionary shop after the line first_item = items[0]. For the shop that fails, this prints:
{'A': [[3.0, ['A']]], 'B': [[1.0, ['B']]], 'D': [[1.25, ['D']]], 'CCC': [[1.2, ['CCC']]]}
So burger is not one of its keys, and hence the lookup raises a KeyError.
Add this line:
2,1.25,burger
to the end of your input.csv file and your code works fine.
Alternatively, read the values from the shop dictionary inside a try/except block to handle the case where an item is not present.
Note:
In your method build_shops the line:
shop_id,cost,items = item_info.replace('\n', '').split(',')
strips the newline but not the carriage return. To fix that, do this:
shop_id,cost,items = item_info.replace('\n', '').replace('\r', '').split(',')
Hope this helps.
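The carriage-return point can be seen with a single line of input. This is a small sketch with a made-up sample line, assuming the file uses Windows line endings and is read without newline translation (as with the 'rb' mode in the question); str.strip() is shown as one alternative that removes both characters.

```python
line = '1,2.00,burger\r\n'  # a line as read from a file with Windows line endings

# Removing only '\n' leaves the '\r' attached to the last field.
newline_only = line.replace('\n', '').split(',')

# strip() removes surrounding whitespace, including '\r' and '\n'.
stripped = line.strip().split(',')

print(repr(newline_only[2]))
print(repr(stripped[2]))
```

With the '\r' left in place, the item is stored under the key 'burger\r', so a later lookup of 'burger' would not find it.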
I think I've fixed it...
solve_one_shop
The for loop should only run inside the if; otherwise you get a KeyError. I have also changed it so that it only returns a value if all_possible contains anything (an empty list evaluates to False).
edit: To prevent a TypeError, I assign the result to a temporary variable this_subset, and the rest of the loop only runs if it is not None.
def solve_one_shop(shop, items):
    if len(items) == 0:
        return [0.0, []]
    all_possible = []
    first_item = items[0]
    if first_item in shop:
        for (price,combo) in shop[first_item]:
            sub_set = [x for x in items if x not in combo]
            this_subset = solve_one_shop(shop, sub_set)
            if this_subset is not None:
                price_sub_set,solution = this_subset
                solution.append([price,combo])
                all_possible.append([price+price_sub_set, solution])
    if all_possible:
        cheapest = min(all_possible, key=(lambda x: x[0]))
        return cheapest
solver
I have assigned the return value of solve_one_shop to an intermediate variable. If this is None, the shop is not added to result_all_shops.
edit: If result_all_shops is empty, print a message instead of trying to find the min.
def solver(input_data, required_items):
    shops = build_shops(input_data)
    result_all_shops = []
    for shop_id,shop_info in shops.iteritems():
        this_shop = solve_one_shop(shop_info, required_items)
        if this_shop is not None:
            (price, solution) = this_shop
            result_all_shops.append([shop_id, price, solution])
    if result_all_shops:
        shop_id,total_price,solution = min(result_all_shops, key=(lambda x: x[1]))
        print('SHOP_ID=%s' % shop_id)
        sln_str = [','.join(items)+'(%0.2f)'%price for (price,items) in solution]
        sln_str = '+'.join(sln_str)
        print(sln_str + ' = %0.2f' % total_price)
    else:
        print "Item not available"
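The code above is Python 2 (print statements, iteritems). For readers on Python 3, the same idea might be sketched as follows. This is an adaptation rather than the answerer's code: the name cheapest_shop is hypothetical, the function returns its result instead of printing it, the input is a list of lines standing in for the opened file, and strip() is used so that both '\n' and '\r' are removed.

```python
def build_shops(lines):
    """Map shop_id -> item -> list of (price, combo) offers containing it."""
    shops = {}
    for line in lines:
        line = line.strip()  # removes '\n' and '\r'
        if not line:
            continue
        shop_id, cost, items = line.split(',')
        combo = items.split('+')
        for item in combo:
            offers = shops.setdefault(shop_id, {}).setdefault(item, [])
            offers.append((float(cost), combo))
    return shops


def solve_one_shop(shop, items):
    """Return (total_price, offers) covering all items, or None if impossible."""
    if not items:
        return 0.0, []
    best = None
    # .get() with a default avoids the KeyError for items the shop lacks.
    for price, combo in shop.get(items[0], []):
        remaining = [x for x in items if x not in combo]
        sub = solve_one_shop(shop, remaining)
        if sub is None:
            continue
        total = price + sub[0]
        if best is None or total < best[0]:
            best = (total, sub[1] + [(price, combo)])
    return best


def cheapest_shop(lines, required_items):
    """Return (shop_id, total_price, offers) or None if no shop has everything."""
    results = []
    for shop_id, shop in build_shops(lines).items():
        solved = solve_one_shop(shop, required_items)
        if solved is not None:
            results.append((shop_id, solved[0], solved[1]))
    return min(results, key=lambda r: r[1]) if results else None


input_2 = ['1,2.00,A', '1,1.25,B', '1,2.00,B', '1,1.00,A', '1,1.00,A+B',
           '1,1.50,A+CCC', '1,2.50,A+D', '2,3.00,A', '2,1.00,B',
           '2,1.20,CCC', '2,1.25,D']
print(cheapest_shop(input_2, ['A', 'CCC']))
```

Returning None for "not available" keeps the search logic separate from the printing, so the caller can decide how to report it.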
