Find max value of a column based on another in python

I have a 2D list, shown below. It records the number of times each student topped the exams:
main_record = [
    ['student1', 1],
    ['student2', 1],
    ['student2', 2],
    ['student1', 5],
    ['student3', 3],
]
I have another list of unique students:
students_enrolled = ['student1', 'student2', 'student3']
from which I want to produce a student ranking based on their distinctions:
student_ranking = ['student1', 'student3', 'student2']
Which built-in functions would be useful? I could not work out how to phrase a proper search query. In other words, I need the Python equivalent of the following queries:
select max(main_record[1]) where name = student1 >>> result = 5
select max(main_record[1]) where name = student2 >>> result = 2
select max(main_record[1]) where name = student3 >>> result = 3

Build a dict keyed by student name that stores the maximum value seen for each student, then sort students_enrolled by each student's maximum.
from collections import defaultdict
main_record = [['student1',1], ['student2',1], ['student2',2], ['student1',5], ['student3',3]]
students_enrolled = ['student1','student2','student3']
# define a dict that defaults to negative infinity and update it with the max on each iteration
tmp_dct = defaultdict(lambda: float('-inf'))
for lst in main_record:
    k, v = lst
    tmp_dct[k] = max(tmp_dct[k], v)
print(tmp_dct)
students_enrolled.sort(key=lambda x: tmp_dct[x], reverse=True)
print(students_enrolled)
Output:
# tmp_dct =>
defaultdict(<function <lambda> at 0x7fd81044b1f0>,
{'student1': 5, 'student2': 2, 'student3': 3})
# students_enrolled after sorting
['student1', 'student3', 'student2']
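For readers who want something closer to the SQL-style queries in the question, here is a minimal sketch. The helper name max_for and the choice of default=0 for students with no rows are assumptions made for illustration, not part of the original answer.

```python
main_record = [['student1', 1], ['student2', 1], ['student2', 2],
               ['student1', 5], ['student3', 3]]
students_enrolled = ['student1', 'student2', 'student3']

def max_for(name, records):
    """Rough equivalent of: select max(value) where name = <name>."""
    # default=0 is an assumption for students that have no rows at all
    return max((value for student, value in records if student == name), default=0)

# one "query" per enrolled student
best = {name: max_for(name, main_record) for name in students_enrolled}

# rank the students by their best value, highest first
student_ranking = sorted(students_enrolled, key=best.get, reverse=True)
print(best)             # {'student1': 5, 'student2': 2, 'student3': 3}
print(student_ranking)  # ['student1', 'student3', 'student2']
```

This rescans main_record once per student, so the single-pass dict approach above is likely the better choice for large lists.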

If it is a 2D list, it should look like this: l = [["student1", 2], ["student2", 3], ["student3", 4]]. To list the rows from the highest numeric value in the 2nd column to the lowest, you can use a loop like this:
l = [["student1", 2], ["student2", 3], ["student3", 4]]
# sort a copy of the rows by the 2nd column, highest first
for student in sorted(l, key=lambda row: row[1], reverse=True):
    print(student, student[1])

Related

How to get a unique value from a list in python

Imagine I have this list:
list = ['a','a','b','a']
I would like to do something like this:
print(unique(list))
to retrieve the unique item(s), so python will output b. How could I do this?

Count the items, keep only those which have a count of 1:
>>> data = ['a','a','b','a']
>>> from collections import Counter
>>> [k for k,v in Counter(data).items() if v == 1]
['b']

Using Python's set() type, we can easily get the distinct values. Insert the values of the list into a set; a set stores each value only once, even if it is inserted more than once. After building the set with list_set = set(list1), convert it back to a list to print it.
For more ways, see: https://www.geeksforgeeks.org/python-get-unique-values-list/#:~:text=Using%20set()%20property%20of,a%20list%20to%20print%20it.
Example of a unique function:
# Python program to get
# unique values from a list
# using set
# function to get unique values
def unique(list1):
    # insert the list into the set
    list_set = set(list1)
    # convert the set to a list
    unique_list = list(list_set)
    for x in unique_list:
        print(x, end=" ")
# driver code
list1 = [10, 20, 10, 30, 40, 40]
print("the unique values from 1st list is")
unique(list1)
list2 = [1, 2, 1, 1, 3, 4, 3, 3, 5]
print("\nthe unique values from 2nd list is")
unique(list2)
The output should be (set ordering is not guaranteed, so the order may differ):
the unique values from 1st list is
40 10 20 30
the unique values from 2nd list is
1 2 3 4 5
You can also use numpy.unique. Example:
# Python program to get
# unique values from a list
# using numpy.unique
import numpy as np
# function to get unique values
def unique(list1):
    x = np.array(list1)
    print(np.unique(x))
# driver code
list1 = [10, 20, 10, 30, 40, 40]
print("the unique values from 1st list is")
unique(list1)
list2 = [1, 2, 1, 1, 3, 4, 3, 3, 5]
print("\nthe unique values from 2nd list is")
unique(list2)
output should be :
the unique values from 1st list is
[10 20 30 40]
the unique values from 2nd list is
[1 2 3 4 5]
For more, please see https://www.geeksforgeeks.org/python-get-unique-values-list/#:~:text=Using%20set()%20property%20of,a%20list%20to%20print%20it.

You could use collections.Counter() for this, e.g.:
from collections import Counter
my_list = ['a','a','b','a']
counts = Counter(my_list)
for element, count in counts.items():
    if count == 1:
        print(element)
would print b and nothing else in this case.
Or, if you'd like to store the result in a list:
from collections import Counter
my_list = ['a','a','b','a']
counts = Counter(my_list)
unique_elements = [element for element, count in counts.items() if count == 1]
unique_elements is ['b'] in this case.

1. Count the appearances of each element in the list and store the counts in a dictionary.
2. From the dictionary, check which elements appear only once.
3. Store the unique (single-appearance) elements in a list.
4. Print the list.
You can try this:
my_list = ['a','a','b','a']
my_dict = {}
only_one = []
# Step 1
for element in my_list:
    if element not in my_dict:
        my_dict[element] = 1
    else:
        my_dict[element] += 1
# Step 2 and Step 3
for key, value in my_dict.items():
    if value == 1:
        only_one.append(key)
# Step 4
for unique in only_one:
    print(unique)
Output:
b
In case you are wondering what other variables contain:
my_dict = {'a': 3, 'b': 1}
only_one = ['b']

The most straightforward way: use a dict of counters.
a = ['a','a','b','a']
counts = {}
for element in a:
    counts[element] = counts.get(element, 0) + 1
for element in a:
    if counts[element] == 1:
        print(element)
out:
b

What is the most efficient algorithm in Python to find the highest and lowest values across all data lists and associated titles

In my program there are multiple quizzes. A user takes a quiz, then the title of the quiz and the score are saved to a database. For ease with the example, I'll represent them using Python lists:
[['quizTitle1', score], ['quizTitle2', score], ['quizTitle1', score], ['quizTitle3', score]]
I’m trying to print out the title of the quiz that a user is weakest on.
So, using the Python list example you see that the user has taken quiz 1 two times. On their second go they may have got a better score for the quiz than the first. So, I need to get the highest score the user has achieved with each quiz (their best score). Then I need to find which quiz has the lowest, best score.
My current plan is like this (pseudo code)
While found = false
    1st = the first score selected that we are comparing with each other score
    2nd = the score we are comparing to the first
    For loop that repeats in the range of the number of lists
        If (2nd < 1st) or (2nd has the same title and greater mark than 1st):
            2nd becomes 1st
            Loop repeats
        Else:
            New 2nd is the next list
    Found = true
But what is the best way to do this?

You could use a dictionary to store the best score of each quiz, updating each value with the maximum seen so far in your list, then take the minimum of all the values in the dictionary.
scores = [['q1', 20],['q2',30],['q1',40],['q2',10],['q2',45],['q1',10]]
d = {}
for s in scores:
    d[s[0]] = s[1] if s[0] not in d else max(d[s[0]], s[1])
print(d)
print("Lowest best : ", min(d.values()))
This prints:
{'q1': 40, 'q2': 45}
Lowest best : 40
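The question asks for the title of the weakest quiz rather than only its score. A minimal sketch of that last step, reusing the same sample data (the names best and weakest_title are illustrative choices, not from the original answer):

```python
scores = [['q1', 20], ['q2', 30], ['q1', 40], ['q2', 10], ['q2', 45], ['q1', 10]]

best = {}
for title, score in scores:
    # dict.get with a default avoids the explicit membership test
    best[title] = max(best.get(title, score), score)

# min over the keys, compared by their stored best score
weakest_title = min(best, key=best.get)
print(weakest_title, best[weakest_title])  # q1 40
```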

Well, if you are open to pandas, then:
import pandas as pd
l = [["quizTitle1", 15],
["quizTitle2", 25],
["quizTitle1", 11],
["quizTitle3", 84],
["quizTitle2", 24]]
df = pd.DataFrame(l, columns=["quiz", "score"])
print(df)
# quiz score
# 0 quizTitle1 15
# 1 quizTitle2 25
# 2 quizTitle1 11
# 3 quizTitle3 84
# 4 quizTitle2 24
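# take the best score per quiz, then pick the row with the lowest best score from that grouped frame
best = df.groupby("quiz")["score"].max().reset_index()
lowest_score = best.iloc[best["score"].idxmin()]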
print(lowest_score)
# quiz quizTitle1
# score 15
# Name: 0, dtype: object

A simple one:
scores = [['q1', 20],['q2',30],['q1',40],['q2',10],['q2',45],['q1',10]]
d = dict(sorted(scores))
print(min(d, key=d.get)) # prints q1
The dict function takes key/value pairs; we just need to sort them first so that each key's last value is its largest (because the last value is what ends up in the dict). After that, the desired result is simply the key with the smallest value.
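To see why this works, it may help to print the intermediate values. The sketch below only splits the one-liner into steps; the names ordered and best are illustrative.

```python
scores = [['q1', 20], ['q2', 30], ['q1', 40], ['q2', 10], ['q2', 45], ['q1', 10]]

# lists compare element by element: title first, then score
ordered = sorted(scores)
print(ordered)
# [['q1', 10], ['q1', 20], ['q1', 40], ['q2', 10], ['q2', 30], ['q2', 45]]

# later pairs overwrite earlier ones for the same key, so the largest score survives
best = dict(ordered)
print(best)  # {'q1': 40, 'q2': 45}

print(min(best, key=best.get))  # q1
```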

A map-reduce approach:
from itertools import groupby
from operator import itemgetter
scores = [['q1', 20],['q2',30],['q1',40],['q2',10],['q2',45],['q1',10]]
name, score = itemgetter(0), itemgetter(1)
grouped_scores = groupby(sorted(scores), key=name) # group by key
highest_scores = (max(g, key=score) for _,g in grouped_scores) # reduce by key
lowest_highest = min(highest_scores, key=score) # reduce
print(lowest_highest)
Output:
['q1', 40]
Explanation
The functions used are:
sorted (docs/builtin/sorted), to sort results by quiz name;
itertools.groupby (docs/itertools/groupby), which groups results by quiz, assuming they are already sorted by quiz;
a generator expression, to apply a function to every element of a list... here we have a list of lists and we apply the function max to every list;
max and min (docs/builtin/min), my two "reduce" functions.
The return values of groupby and the generator expression are not lists, and if you try to print them directly, you'll see a bunch of unhelpful <itertools._grouper object at 0x7ff18bbbb850>. But after converting every non-printable object to a list using list(), the intermediate values computed are these:
scores = [['q1', 20],['q2',30],['q1',40],['q2',10],['q2',45],['q1',10]]
grouped_scores = [
['q1', [['q1', 10], ['q1', 20], ['q1', 40]]],
['q2', [['q2', 10], ['q2', 30], ['q2', 45]]]
]
highest_scores = [['q1', 40], ['q2', 45]]
lowest_highest = ['q1', 40]
Python's map and reduce
Two functions which can often be useful in a map-reduce algorithm:
map (docs/builtin/map), instead of the generator expression, to apply a function to every element of a list;
functools.reduce (docs/functools/reduce), to repeatedly apply a binary function to the elements in a list, two by two, and replace those two elements by the result, until there is only one element left.
In this case, we are looking for the lowest of the highest scores, so when comparing two elements we would like to keep the min of the two. But instead of applying the min() function repeatedly with reduce, in python we can call min() directly on the whole sequence.
Just for reference, here is what the code would look like if we had used reduce:
from itertools import groupby
from functools import reduce
from operator import itemgetter
scores = [['q1', 20],['q2',30],['q1',40],['q2',10],['q2',45],['q1',10]]
name, score = itemgetter(0), itemgetter(1)
grouped_scores = groupby(sorted(scores), key=name) # group by key
highest_scores = map(lambda x: max(x[1], key=score), grouped_scores) # reduce by key
lowest_highest = reduce(lambda x,y: min(x,y, key=score), highest_scores) # reduce
print(lowest_highest)
Output:
['q1', 40]
Using module more_itertools
Module more_itertools has a function called map_reduce which groups by key, then reduces by key. This takes care of our groupby and max steps; we only need to reduce with min and we have our result.
from more_itertools import map_reduce
from operator import itemgetter
scores = [['q1', 20],['q2',30],['q1',40],['q2',10],['q2',45],['q1',10]]
name, score = itemgetter(0), itemgetter(1)
highest_scores = map_reduce(scores, keyfunc=name, valuefunc=score, reducefunc=max)
lowest_highest = min(highest_scores.items(), key=score)
print(lowest_highest)
# ('q1', 40)

Here is a version using defaultdict, from the built-in collections module. In this case, the value of a key we haven't seen before is an empty list; we don't need to check first, we just append.
from collections import defaultdict
quizzes = defaultdict(list)
scores = [['q1', 20],['q2',30],['q1',40],['q2',10],['q2',45],['q1',10]]
# populate the dictionary of results
for score in scores:
    quiznum = score[0]
    result = score[1]
    quizzes[quiznum].append(result) # new key? we append to empty list
# find min score for each quiz
lowest = {quiznum: min(results) for quiznum, results in quizzes.items()}
print(lowest)
# {'q1': 10, 'q2': 10}
The defaultdict keeps all of the scores, which is not necessary for the posted question. But it will let you determine number of attempts, high score, etc.
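As a rough illustration of that last point, the sketch below derives the number of attempts, the high score per quiz, and the quiz with the lowest high score from the same grouped data. The names attempts, high_scores and weakest are illustrative.

```python
from collections import defaultdict

scores = [['q1', 20], ['q2', 30], ['q1', 40], ['q2', 10], ['q2', 45], ['q1', 10]]

# group every result under its quiz
quizzes = defaultdict(list)
for quiznum, result in scores:
    quizzes[quiznum].append(result)

attempts = {quiznum: len(results) for quiznum, results in quizzes.items()}
high_scores = {quiznum: max(results) for quiznum, results in quizzes.items()}
weakest = min(high_scores, key=high_scores.get)  # quiz with the lowest best score

print(attempts)     # {'q1': 3, 'q2': 3}
print(high_scores)  # {'q1': 40, 'q2': 45}
print(weakest)      # q1
```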

A quick way using built-in Python functions:
lst = [['quizTitle1', 6], ['quizTitle2', 5], ['quizTitle1', 9], ['quizTitle3', 7]]
sorted_list = sorted(lst, key=lambda x: x[1])
print(f'1st quiz: {sorted_list[-1][0]} | score: {sorted_list[-1][1]}')
print(f'last on quiz: {sorted_list[0][0]} | score: {sorted_list[0][1]}')
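Basically, you sort the list by score and then take the last item, which has the highest score, and the first item, which has the lowest. Note that this ranks individual attempts and does not group the scores by quiz. However, this relies on the built-in sort rather than a hand-written algorithm.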

In list of lists, how to find average of values associated with inner lists?

I have a list like this:
l=[['Alex',12],['John',14],['Ross',24],['Alex',42],['John',24],['Alex',45]]
How should I process this list so that I get an output like this?
l=[['Alex',33],['John',19],['Ross',24]]
This is basically the average of the scores for each name.

Use pandas to group by name and calculate mean (l is your list):
import pandas as pd
df = pd.DataFrame(l,columns=['name','value'])
l = df.groupby('name').value.mean().reset_index().values.tolist()
df:
name value
0 Alex 12
1 John 14
2 Ross 24
3 Alex 42
4 John 24
5 Alex 45
output:
[['Alex', 33.0], ['John', 19.0], ['Ross', 24.0]]

l = [['Alex',12],['John',14],['Ross',24],['Alex',42],['John',24],['Alex',45]]
score_dict = {}
# collect every score under its name
for l_score in l:
    name = l_score[0]
    score = l_score[1]
    if name in score_dict:
        score_dict[name].append(score)
    else:
        score_dict[name] = [score]
ret_list = []
# average the collected scores for each name
for k, v in score_dict.items():
    sum_l = sum(v)
    len_l = len(v)
    if len_l > 0:
        avg = float(sum_l)/float(len_l)
    else:
        avg = 0
    ret_list.append([k,avg])
print(ret_list)
This should print the following list (on Python 3.7+, where dicts keep insertion order):
[['Alex', 33.0], ['John', 19.0], ['Ross', 24.0]]
I did not use any package as there were no imports in your code sample. It can be simplified with numpy or pandas
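Even without numpy or pandas, the standard library can shorten this. The following is a sketch using collections.defaultdict, not the answerer's own simplification; the name scores_by_name is illustrative.

```python
from collections import defaultdict

l = [['Alex', 12], ['John', 14], ['Ross', 24], ['Alex', 42], ['John', 24], ['Alex', 45]]

# a missing name starts out as an empty list, so no membership test is needed
scores_by_name = defaultdict(list)
for name, score in l:
    scores_by_name[name].append(score)

# every list has at least one score, so the division is safe
averages = [[name, sum(scores) / len(scores)] for name, scores in scores_by_name.items()]
print(averages)  # [['Alex', 33.0], ['John', 19.0], ['Ross', 24.0]]
```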

Let's simplify the problem by constructing a new dict in which the keys are the names (the first element of each inner list) and the values hold what is needed to compute the average. Since keys are unique in Python dicts, this becomes easy. After doing this, we generate a new list from the constructed dict, and that is our answer.
TheOriginalList = [['Alex',12],['John',14],['Ross',24],['Alex',42],['John',24],['Alex',45]]
aux_dict = {}
for inner_list in TheOriginalList:
    if not aux_dict.get(inner_list[0], None):  #_1_
        aux_dict[inner_list[0]] = [inner_list[1], 1]  #_2_
    else:
        aux_dict[inner_list[0]][0] += inner_list[1]  #_3_
        aux_dict[inner_list[0]][1] += 1  #_4_
final_list = []
for k, v in aux_dict.items():  #_5_
    final_list.append([k, v[0] / v[1]])  #_6_
Explanations:
1. In #1 we try to get the key, which is the person's name. If it already exists in the dict, we get its value, a list of two items [cumulative_score, counter], and go to the else branch at #3. If it does not exist, we enter #2.
2. In #2 we add the key (the person's name) to the dict and set its value to a new list of two items [current_score, 1]. The 1 is a counter recording that this is the first score; we need it later to calculate the average.
3. We reach #3 because this person already exists in the dict, so we add the current score to the running total.
4. In #4 we increment the counter by 1.
5. In #5 we iterate over the dict's keys and values, so in each iteration we get the key (the person's name) and the value (a list of two items: the first is the total score and the second is the number of scores).
6. In #6 we construct the final list by appending a new list of two items: index 0 holds the person's name, which is the current key, and index 1 holds the average, which is v[0]/v[1].
Keep in mind that this code can raise exceptions in some cases, so consider using try-except.
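The answer does not say which exceptions it has in mind. As one possible reading, the sketch below guards against rows that do not have exactly two items and scores that are not numeric. The function name average_scores and the decision to skip bad rows silently are assumptions made for illustration.

```python
def average_scores(rows):
    """Average the score per name, skipping rows that cannot be used."""
    totals = {}  # name -> [running_total, count]
    for row in rows:
        try:
            name, score = row     # fails if the row does not have exactly two items
            score = float(score)  # fails if the score is not numeric
        except (ValueError, TypeError):
            continue              # assumption: silently skip malformed rows
        entry = totals.setdefault(name, [0.0, 0])
        entry[0] += score
        entry[1] += 1
    return [[name, total / count] for name, (total, count) in totals.items()]

rows = [['Alex', 12], ['John', 14], ['Ross', 24], ['Alex', 42],
        ['John', 24], ['Alex', 45], ['Broken'], ['Ross', 'n/a']]
print(average_scores(rows))
# [['Alex', 33.0], ['John', 19.0], ['Ross', 24.0]]
```

Whether bad rows should be skipped, logged, or treated as errors depends on where the data comes from.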

Maximum value between two lists, and their indexes

I have a nested dictionary with list of values, I want to have
- the maximum index wise value between two lists
- the 'id' for each max value (by id I mean from which list is the value coming and what index it is).
I already have the index-wise max value between the two lists; what I need help with is the 'id'.
#create dictionary:
test = {}
test['A'] = {}
test['A']['number'] = [2,2,3]
test['A']['id'] = ['x','y','z']
test['B'] = {}
test['B']['number'] = [1,3,2]
test['B']['id'] = ['a','b','c']
#this the maximum index-wise value between the two lists
max_list = [max(*l) for l in zip(test['A']['number'], test['B']['number'])]
print(max_list)
What I would like is another list with the following:
['x','b','z']

Make an inner zip of id and number so we know which id belongs to which number, then use max with a custom key function (by number), then split the results:
test = {}
test['A'] = {}
test['A']['number'] = [2,2,3]
test['A']['id'] = ['x','y','z']
test['B'] = {}
test['B']['number'] = [1,3,2]
test['B']['id'] = ['a','b','c']
tuple_list = [max(*l, key=lambda t: t[1]) for l in zip(zip(test['A']['id'],test['A']['number']), zip(test['B']['id'],test['B']['number']))]
max_num_list = [t[1] for t in tuple_list]
max_id_list = [t[0] for t in tuple_list]
print(max_num_list)
print(max_id_list)
Output:
[2, 3, 3]
['x', 'b', 'z']
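The same idea extends to any number of groups. The sketch below is a variation, not the answerer's code: it assumes every group has 'id' and 'number' lists of equal length, and the names paired and winners are illustrative.

```python
test = {
    'A': {'number': [2, 2, 3], 'id': ['x', 'y', 'z']},
    'B': {'number': [1, 3, 2], 'id': ['a', 'b', 'c']},
}

# one list of (id, number) pairs per group, in the dict's insertion order
paired = [list(zip(group['id'], group['number'])) for group in test.values()]

# zip(*paired) yields, for each index, the candidate pairs from every group
winners = [max(candidates, key=lambda pair: pair[1]) for candidates in zip(*paired)]

max_id_list = [identifier for identifier, _ in winners]
max_num_list = [number for _, number in winners]
print(max_num_list)  # [2, 3, 3]
print(max_id_list)   # ['x', 'b', 'z']
```

On a tie, max keeps the first candidate it sees, so the earlier group wins.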

Python - sorting a list of numbers based on indexes

I need to create a program with a class that creates "Food" objects and a list called "fridge" that holds these objects.
class Food:
    def __init__(self, name, expiration):
        self.name = name
        self.expiration = expiration
fridge = [Food("beer",4), Food("steak",1), Food("hamburger",1), Food("donut",3),]
This was not hard. Then I created a function that gives you the highest expiration number.
def exp(fridge):
    expList = []
    xen = 0
    for i in range(0, len(fridge)):
        expList.append(fridge[xen].expiration)
        xen += 1
    print(expList)
    sortedList = sorted(expList)
    return sortedList.pop()
exp(fridge)
This one works too. Now I have to create a function that returns a list where each index is an expiration date and the value at that index is the number of foods with that expiration date.
The output should look like: [0,2,1,1]. The value 0 at index 0 means that there is no food with expiration date "0". Index 1 means that there are 2 pieces of food with 1 day left until expiration, and so on. I got stuck with too many if lines and I can't get this one to work at all. How should I approach this? Thanks for the help.

In order to return it as a list, you will first need to figure out the maximum expiration date in the fridge.
max_expiration = max(food.expiration for food in fridge) +1 # need +1 since 0 is also a possible expiration
exp_list = [0] * max_expiration
for food in fridge:
    exp_list[food.expiration] += 1
print(exp_list)
This prints [0, 2, 0, 1, 1].

You can iterate on the list of Food objects and update a dictionary keyed on expiration, with the values as number of items having that expiration. Avoid redundancy such as keeping zero counts in a list by using a collections.Counter object (a subclass of dict):
from collections import Counter
d = Counter(food.expiration for food in fridge)
# fetch number of food with expiration 0
print(d[0]) # -> 0
# fetch number of food with expiration 1
print(d[1]) # -> 2
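If the list form from the question is still needed, it can be derived from the Counter. A minimal sketch, repeating the Food class from the question so that it runs on its own (the name exp_list is illustrative):

```python
from collections import Counter

class Food:
    def __init__(self, name, expiration):
        self.name = name
        self.expiration = expiration

fridge = [Food("beer", 4), Food("steak", 1), Food("hamburger", 1), Food("donut", 3)]

d = Counter(food.expiration for food in fridge)
# a Counter returns 0 for missing keys, so the gaps fill themselves in
exp_list = [d[day] for day in range(max(d) + 1)]
print(exp_list)  # [0, 2, 0, 1, 1]
```

Note that max(d) raises ValueError for an empty fridge, so that case would need its own check.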

You can use itertools.groupby to create a dict where each key is a food expiration date and each value is the number of times it occurs in the list:
>>> from itertools import groupby
>>> fridge = [Food("beer",4), Food("steak",1), Food("hamburger",1), Food("donut",3),]
>>> d = dict((k,len(list(v))) for k,v in groupby(sorted(fridge,key=lambda x: x.expiration), key=lambda x: x.expiration))
Here we tell groupby to group all elements of the list that have the same expiration (note the key argument in groupby). The output of the groupby operation is roughly equivalent to (k,[v]), where k is the group key and [v] is the list of values belonging to that particular group.
This will produce output like this:
>>> d
{1: 2, 3: 1, 4: 1}
At this point we have each expiration and the number of times it occurs in the list, stored in the dict d.
Next we need to create a list such that if an element is present in the dict d we output its count, and otherwise output 0. We need to iterate from 0 to the maximum key in dict d. To do this we can write:
>>> [0 if not d.get(x) else d.get(x) for x in range(0, max(d.keys())+1)]
This will yield your required output:
[0, 2, 0, 1, 1]

Here is a flexible method using collections.defaultdict:
from collections import defaultdict
def ReverseDictionary(input_dict):
    reversed_dict = defaultdict(set)
    for k, v in input_dict.items():
        reversed_dict[v].add(k)
    return reversed_dict
fridge_dict = {f.name: f.expiration for f in fridge}
exp_food = ReverseDictionary(fridge_dict)
# defaultdict(set, {1: {'hamburger', 'steak'}, 3: {'donut'}, 4: {'beer'}})
exp_count = {k: len(exp_food.get(k, set())) for k in range(max(exp_food)+1)}
# {0: 0, 1: 2, 2: 0, 3: 1, 4: 1}

Modify yours with count().
def exp(fridge):
    output = []
    exp_list = [i.expiration for i in fridge]
    for i in range(0, max(exp_list)+1):
        output.append(exp_list.count(i))
    return output
