How to optimize this Python code?

How to optimize this Python code? - python

def maxVote(nLabels):
count = {}
maxList = []
maxCount = 0
for nLabel in nLabels:
if nLabel in count:
count[nLabel] += 1
else:
count[nLabel] = 1
#Check if the count is max
if count[nLabel] > maxCount:
maxCount = count[nLabel]
maxList = [nLabel,]
elif count[nLabel]==maxCount:
maxList.append(nLabel)
return random.choice(maxList)
nLabels contains a list of integers.
The above function returns the integer with highest frequency, if more than one have same frequency then a randomly selected integer from them is returned.
E.g. maxVote([1,3,4,5,5,5,3,12,11]) is 5

import random
import collections
def maxvote(nlabels):
cnt = collections.defaultdict(int)
for i in nlabels:
cnt[i] += 1
maxv = max(cnt.itervalues())
return random.choice([k for k,v in cnt.iteritems() if v == maxv])
print maxvote([1,3,4,5,5,5,3,3,11])

In Python 3.1 or future 2.7 you'd be able to use Counter:
>>> from collections import Counter
>>> Counter([1,3,4,5,5,5,3,12,11]).most_common(1)
[(5, 3)]
If you don't have access to those versions of Python you could do:
>>> from collections import defaultdict
>>> d = defaultdict(int)
>>> for i in nLabels:
d[i] += 1
>>> max(d, key=lambda x: d[x])
5

It appears to run in O(n) time. However there may be a bottleneck in checking if nLabel in count since this operation could also potentially run O(n) time as well, making the total efficiency O(n^2).
Using a dictionary instead of a list in this case is the only major efficiency boost I can spot.

I'm not sure what exactly you want to optimize, but this should work:
from collections import defaultdict
def maxVote(nLabels):
count = defaultdict(int)
for nLabel in nLabels:
count[nLabel] += 1
maxCount = max(count.itervalues())
maxList = [k for k in count if count[k] == maxCount]
return random.choice(maxList)

Idea 1
Does the return really need to be random, or can you just return a maximum? If you just need to nondeterministically return a max frequency, you could just store a single label and remove the list logic, including
elif count[nLabel]==maxCount:
maxList.append(nLabel)
Idea 2
If this method is called frequently, would it be possible to only work on new data, as opposed to the entire data set? You could cache your count map and then only process new data. Assuming your data set is large and the calculations are done online, this could net huge improvements.

Complete example:
#!/usr/bin/env python
def max_vote(l):
"""
Return the element with the (or a) maximum frequency in ``l``.
"""
unsorted = [(a, l.count(a)) for a in set(l)]
return sorted(unsorted, key=lambda x: x[1]).pop()[0]
if __name__ == '__main__':
votes = [1, 3, 4, 5, 5, 5, 3, 12, 11]
print max_vote(votes)
# => 5
Benchmarks:
#!/usr/bin/env python
import random
import collections
def max_vote_2(l):
"""
Return the element with the (or a) maximum frequency in ``l``.
"""
unsorted = [(a, l.count(a)) for a in set(l)]
return sorted(unsorted, key=lambda x: x[1]).pop()[0]
def max_vote_1(nlabels):
cnt = collections.defaultdict(int)
for i in nlabels:
cnt[i] += 1
maxv = max(cnt.itervalues())
return random.choice([k for k,v in cnt.iteritems() if v == maxv])
if __name__ == '__main__':
from timeit import Timer
votes = [1, 3, 4, 5, 5, 5, 3, 12, 11]
print max_vote_1(votes)
print max_vote_2(votes)
t = Timer("votes = [1, 3, 4, 5, 5, 5, 3, 12, 11]; max_vote_2(votes)", \
"from __main__ import max_vote_2")
print 'max_vote_2', t.timeit(number=100000)
t = Timer("votes = [1, 3, 4, 5, 5, 5, 3, 12, 11]; max_vote_1(votes)", \
"from __main__ import max_vote_1")
print 'max_vote_1', t.timeit(number=100000)
Yields:
5
5
max_vote_2 1.79455208778
max_vote_1 2.31705093384

Related

How to create a dict from list where values are elements of the list and keys are function of those elements in python?

I have a list of caller_address elements. For each of these addresses I can get a caller_function, a function containing that caller_address. In a single function there may be more than 1 address.
So if I have a list of caller_address elements:
caller_addresses = [1, 2, 3, 4, 5, 6, 7, 8]
For each of them I can get a function:
caller_functions = [getFunctionContaining(addr) for addr in caller_addresses]
print(caller_functions)
# prints(example): ['func1', 'func1', 'func2', 'func2', 'func2', 'func2', 'func3', 'func3']
In the result I need to get a dict where keys are the functions and values are lists of addresses those functions contain. In my example in must be:
{'func1': [1, 2], 'func2': [3, 4, 5, 6], 'func3': [7, 8]}
# Means 'func1' contains addresses 1 and 2, 'func2' contains 3, 4, 5 and 6, ...
It would be great if there was a function like:
result = to_dict(lambda addr: getFunctionContaining(addr), caller_addresses)
to get the same result.
Where the first argument is the function for keys and the second argument is the list of values. Is there such function in standard library in python?
I could implement it with for loop and dict[getFunctionContaining(addr)].append(addr), but I'm looking for more pythonic way to do this.
Thanks!

Found a solution using itertools.groupby.
This solution is also faster than a solution using a loop.
import itertools
import time
def f(v):
if v < 5:
return 1
if v < 7:
return 2
return 3
def to_dict(key, list_):
out = {}
for el in list_:
out.setdefault(key(el), []).append(el)
return out
def to_dict2(key, list_):
return {k: list(v) for k, v in itertools.groupby(list_, key)}
lst = [1, 2, 3, 4, 5, 6, 7, 8] * 10**4
COUNT = 1000
def timeit(to_dict_f):
elapsed_sum = 0
for _ in range(COUNT):
elapsed_sum -= time.time()
to_dict_f(f, lst)
elapsed_sum += time.time()
return elapsed_sum / COUNT
print('Average time: ', timeit(to_dict), timeit(to_dict2))
Results:
Average time: 0.014930561065673828 0.01346096110343933
to_dict2 (itertools.groupby) on average takes less time than to_dict (loop)

Loops and Dictionary

How to loop in a list while using dictionaries and return the value that repeats the most, and if the values are repeated the same amount return that which is greater?
Here some context with code unfinished
def most_frequent(lst):
dict = {}
count, itm = 0, ''
for item in lst:
dict[item] = dict.get(item, 0) + 1
if dict[item] >= count:
count, itm = dict[item], item
return itm
#lst = ["a","b","b","c","a","c"]
lst = [2, 3, 2, 2, 1, 3, 3,1,1,1,1] #this should return 1
lst2 = [2, 3, 2, 2, 1, 3, 3] # should return 3
print(most_frequent(lst))

Here is a different way to go about it:
def most_frequent(lst):
# Simple check to ensure lst has something.
if not lst:
return -1
# Organize your data as: {number: count, ...}
dct = {}
for i in lst:
dct[i] = dct[i] + 1 if i in dct else 1
# Iterate through your data and create a list of all large elements.
large_list, large_count = [], 0
for num, count in dct.items():
if count > large_count:
large_count = count
large_list = [num]
elif count == large_count:
large_list.append(num)
# Return the largest element in the large_list list.
return max(large_list)
There are many other ways to solve this problem, including using filter and other built-ins, but this is intended to give you a working solution so that you can start thinking on how to possibly optimize it better.
Things to take out of this; always think:
How can I break this problem down into smaller parts?
How can I organize my data so that it is more useful and easier to manipulate?
What shortcuts can I use along the way to make this function easier/better/faster?

Your code produces the result as you describe in your question, i.e. 1. However, your question states that you want to consider the case where two list elements are co-equals in maximum occurrence and return the largest. Therefore, tracking and returning a single element doesn't satisfy this requirement. You need to compile the dict and then evaluate the result.
def most_frequent(lst):
dict = {}
for item in lst:
dict[item] = dict.get(item, 0) + 1
itm = sorted(dict.items(), key = lambda kv:(-kv[1], -kv[0]))
return itm[0]
#lst = ["a","b","b","c","a","c"]
lst = [2, 3, 2, 2, 2, 2, 1, 3, 3,1,1,1,1] #this should return 1
lst2 = [2, 3, 2, 2, 1, 3, 3] # should return 3
print(most_frequent(lst))
I edited the list 'lst' so that '1' and '2' both occur 5 times. The result returned is a tuple:
(2,5)

I reuse your idea which is quite neat, and I just modified your program a bit.
def get_most_frequent(lst):
counts = dict()
most_frequent = (None, 0) # (item, count)
ITEM_IDX = 0
COUNT_IDX = 1
for item in lst:
counts[item] = counts.get(item, 0) + 1
if most_frequent[ITEM_IDX] is None:
# first loop, most_frequent is "None"
most_frequent = (item, counts[item])
elif counts[item] > most_frequent[COUNT_IDX]:
# if current item's "counter" is bigger than the most_frequent's counter
most_frequent = (item, counts[item])
elif counts[item] == most_frequent[COUNT_IDX] and item > most_frequent[ITEM_IDX]:
# if the current item's "counter" is the same as the most_frequent's counter
most_frequent = (item, counts[item])
else:
pass # do nothing
return most_frequent
lst1 = [2, 3, 2, 2, 1, 3, 3,1,1,1,1, 2] # 1: 5 times
lst2 = [2, 3, 1, 3, 3, 2, 2] # 3: 3 times
lst3 = [1]
lst4 = []
print(get_most_frequent(lst1))
print(get_most_frequent(lst2))
print(get_most_frequent(lst3))
print(get_most_frequent(lst4))

Building a custom Counter function without using built-ins

I have this code:
L = [1, 4, 7, 5, 5, 4, 5, 1, 1, 1]
def frequency(L):
counter = 0
number = L[0]
for i in L:
amount_times = L.count(i)
if amount_times > counter:
counter = amount_times
number = i
return number
print(frequency(L))
But I don't want to use counter function. I want to make code run without any built-in functions. How can I do this?

If you really want to reinvent collections.Counter, this is possible with and without list.count. However, I see no rationale.
Using list.count, you can use a dictionary comprehension. This is inefficient as the list is passed once for each variable.
def frequency2(L):
return {i: L.count(i) for i in set(L)}
If you do not wish to use list.count, this is possible using if / else:
def frequency3(L):
d = {}
for i in L:
if i in d:
d[i] += 1
else:
d[i] = 0
return d
Then to extract the highest count(s):
maxval = max(d.values())
res = [k for k, v in d.items() if v == maxval]

You could try this one. Not sure if this one is acceptable to you.
This finds the most frequent item in a list without using built-ins:
L = [1, 4, 7, 5, 5, 4, 5, 1, 1, 1]
def frequency(L):
count, item = 0, ''
d = {i:0 for i in L}
for i in L[::-1]:
d[i] = d[i] + 1
if d[i] >= count :
count = d[i]
item = i
return item
print(frequency(L))
# 1

duplicates function, adding

Hello I am trying to develop a function to find duplicates in a list. Below is the code that I have obtained thus far. I am cannot seem to figure out how to get the code to correctly add the number of duplicated numbers.
import collections
myList = [5, 9, 14, 5, 2, 5, 1]
def find_duplicates(aList, target):
if target not in aList:
print (target, "occurred 0 times")
else:
n=0
print (target, "occurred",n+1,"times")
the output of the code shows:
find_duplicates(myList, 5)
5 occurred 1 times
Obviously I am missing something for the program to properly track how many times the value occurs? Can someone please help?
I am not allowed to use the count() or sort() built in functions.

To only count the number of duplicates, just iterate over the list, comparing each value. If you find a match, increment a counter, than report the counter. To make this better, I would return the count then print to console outside of the def.
import collections
def find_duplicates(aList, target):
n = 0
for obj in aList:
if obj is target:
n += 1
return n
myList = [5, 9, 14, 5, 2, 5, 1]
target = 5
num_dup = find_duplicates(myList, target)
print (target, "occurred", num_dup, "times")
This should echo out:
5 occurred 3 times
Or do this (with list.count(x)):
myList = [5, 9, 14, 5, 2, 5, 1]
target = 5
num_dup = myList.count(target)
print (target, "occurred", num_dup, "times")
This should echo out:
5 occurred 3 times

You forgot to increment n in your code, so it always print 1. I think your code should look like:
import collections
myList = [5, 9, 14, 5, 2, 5, 1]
def find_duplicates(aList, target):
if target not in aList:
print (target, "occurred 0 times")
else:
n= aList.count(5)
print (target, "occurred",n,"times")
without using count and reading the target from shell:
import collections
myList = [5, 9, 14, 5, 2, 5, 2]
def find_duplicates(aList, target):
result = 0
for item in aList:
if item == target:
result += 1
return result
try:
target = int(raw_input("Choose a number to find duplicates: ")) # for python 3.X use input instead of raw_input
res = find_duplicates(myList, target)
print (target, " occurred ", res, " times")
except:
print("Write a number, not anything else")
This works for integers, if you want to use floats, just change int(...) for float(...)

it's a simple case of using a dictionary. Check the following code:
def frequency(l):
counter = {}
for x in l:
counter[x] = counter.get(x, 0) + 1
return counter
It will iterate over the list, saving each element as a key to the counter dictionary. Note the special form counter.get(x, 0), it will return the value of counter[x] if x is already on the dict, else it will return zero.
Checking the results is a matter of using:
print(frequency(myList))
>>> {9: 1, 2: 1, 5: 3, 14: 1, 1: 1}
You can get the number of appearances of any member by inspecting the dictionary:
frq = frequency(myList)
print(frq[14])
>>> 1
print(frq[1])
>>> 1
Of course it's possible to write a wrapper:
def target_frequencty(target, my_list):
frq = frequencty(my_list)
return frq.get(target, 0)
Enjoy.

Find the item with maximum occurrences in a list [duplicate]

This question already has answers here:
Find the most common element in a list
(27 answers)
Closed 2 years ago.
In Python, I have a list:
L = [1, 2, 45, 55, 5, 4, 4, 4, 4, 4, 4, 5456, 56, 6, 7, 67]
I want to identify the item that occurred the highest number of times. I am able to solve it but I need the fastest way to do so. I know there is a nice Pythonic answer to this.

I am surprised no-one has mentioned the simplest solution,max() with the key list.count:
max(lst,key=lst.count)
Example:
>>> lst = [1, 2, 45, 55, 5, 4, 4, 4, 4, 4, 4, 5456, 56, 6, 7, 67]
>>> max(lst,key=lst.count)
4
This works in Python 3 or 2, but note that it only returns the most frequent item and not also the frequency. Also, in the case of a draw (i.e. joint most frequent item) only a single item is returned.
Although the time complexity of using max() is worse than using Counter.most_common(1) as PM 2Ring comments, the approach benefits from a rapid C implementation and I find this approach is fastest for short lists but slower for larger ones (Python 3.6 timings shown in IPython 5.3):
In [1]: from collections import Counter
...:
...: def f1(lst):
...: return max(lst, key = lst.count)
...:
...: def f2(lst):
...: return Counter(lst).most_common(1)
...:
...: lst0 = [1,2,3,4,3]
...: lst1 = lst0[:] * 100
...:
In [2]: %timeit -n 10 f1(lst0)
10 loops, best of 3: 3.32 us per loop
In [3]: %timeit -n 10 f2(lst0)
10 loops, best of 3: 26 us per loop
In [4]: %timeit -n 10 f1(lst1)
10 loops, best of 3: 4.04 ms per loop
In [5]: %timeit -n 10 f2(lst1)
10 loops, best of 3: 75.6 us per loop

from collections import Counter
most_common,num_most_common = Counter(L).most_common(1)[0] # 4, 6 times
For older Python versions (< 2.7), you can use this recipe to create the Counter class.

In your question, you asked for the fastest way to do it. As has been demonstrated repeatedly, particularly with Python, intuition is not a reliable guide: you need to measure.
Here's a simple test of several different implementations:
import sys
from collections import Counter, defaultdict
from itertools import groupby
from operator import itemgetter
from timeit import timeit
L = [1,2,45,55,5,4,4,4,4,4,4,5456,56,6,7,67]
def max_occurrences_1a(seq=L):
"dict iteritems"
c = dict()
for item in seq:
c[item] = c.get(item, 0) + 1
return max(c.iteritems(), key=itemgetter(1))
def max_occurrences_1b(seq=L):
"dict items"
c = dict()
for item in seq:
c[item] = c.get(item, 0) + 1
return max(c.items(), key=itemgetter(1))
def max_occurrences_2(seq=L):
"defaultdict iteritems"
c = defaultdict(int)
for item in seq:
c[item] += 1
return max(c.iteritems(), key=itemgetter(1))
def max_occurrences_3a(seq=L):
"sort groupby generator expression"
return max(((k, sum(1 for i in g)) for k, g in groupby(sorted(seq))), key=itemgetter(1))
def max_occurrences_3b(seq=L):
"sort groupby list comprehension"
return max([(k, sum(1 for i in g)) for k, g in groupby(sorted(seq))], key=itemgetter(1))
def max_occurrences_4(seq=L):
"counter"
return Counter(L).most_common(1)[0]
versions = [max_occurrences_1a, max_occurrences_1b, max_occurrences_2, max_occurrences_3a, max_occurrences_3b, max_occurrences_4]
print sys.version, "\n"
for vers in versions:
print vers.__doc__, vers(), timeit(vers, number=20000)
The results on my machine:
2.7.2 (v2.7.2:8527427914a2, Jun 11 2011, 15:22:34)
[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)]
dict iteritems (4, 6) 0.202214956284
dict items (4, 6) 0.208412885666
defaultdict iteritems (4, 6) 0.221301078796
sort groupby generator expression (4, 6) 0.383440971375
sort groupby list comprehension (4, 6) 0.402786016464
counter (4, 6) 0.564319133759
So it appears that the Counter solution is not the fastest. And, in this case at least, groupby is faster. defaultdict is good but you pay a little bit for its convenience; it's slightly faster to use a regular dict with a get.
What happens if the list is much bigger? Adding L *= 10000 to the test above and reducing the repeat count to 200:
dict iteritems (4, 60000) 10.3451900482
dict items (4, 60000) 10.2988479137
defaultdict iteritems (4, 60000) 5.52838587761
sort groupby generator expression (4, 60000) 11.9538850784
sort groupby list comprehension (4, 60000) 12.1327362061
counter (4, 60000) 14.7495789528
Now defaultdict is the clear winner. So perhaps the cost of the 'get' method and the loss of the inplace add adds up (an examination of the generated code is left as an exercise).
But with the modified test data, the number of unique item values did not change so presumably dict and defaultdict have an advantage there over the other implementations. So what happens if we use the bigger list but substantially increase the number of unique items? Replacing the initialization of L with:
LL = [1,2,45,55,5,4,4,4,4,4,4,5456,56,6,7,67]
L = []
for i in xrange(1,10001):
L.extend(l * i for l in LL)
dict iteritems (2520, 13) 17.9935798645
dict items (2520, 13) 21.8974409103
defaultdict iteritems (2520, 13) 16.8289561272
sort groupby generator expression (2520, 13) 33.853593111
sort groupby list comprehension (2520, 13) 36.1303369999
counter (2520, 13) 22.626899004
So now Counter is clearly faster than the groupby solutions but still slower than the iteritems versions of dict and defaultdict.
The point of these examples isn't to produce an optimal solution. The point is that there often isn't one optimal general solution. Plus there are other performance criteria. The memory requirements will differ substantially among the solutions and, as the size of the input goes up, memory requirements may become the overriding factor in algorithm selection.
Bottom line: it all depends and you need to measure.

Here is a defaultdict solution that will work with Python versions 2.5 and above:
from collections import defaultdict
L = [1,2,45,55,5,4,4,4,4,4,4,5456,56,6,7,67]
d = defaultdict(int)
for i in L:
d[i] += 1
result = max(d.iteritems(), key=lambda x: x[1])
print result
# (4, 6)
# The number 4 occurs 6 times
Note if L = [1, 2, 45, 55, 5, 4, 4, 4, 4, 4, 4, 5456, 7, 7, 7, 7, 7, 56, 6, 7, 67]
then there are six 4s and six 7s. However, the result will be (4, 6) i.e. six 4s.

If you're using Python 3.8 or above, you can use either statistics.mode() to return the first mode encountered or statistics.multimode() to return all the modes.
>>> import statistics
>>> data = [1, 2, 2, 3, 3, 4]
>>> statistics.mode(data)
2
>>> statistics.multimode(data)
[2, 3]
If the list is empty, statistics.mode() throws a statistics.StatisticsError and statistics.multimode() returns an empty list.
Note before Python 3.8, statistics.mode() (introduced in 3.4) would additionally throw a statistics.StatisticsError if there is not exactly one most common value.

A simple way without any libraries or sets
def mcount(l):
n = [] #To store count of each elements
for x in l:
count = 0
for i in range(len(l)):
if x == l[i]:
count+=1
n.append(count)
a = max(n) #largest in counts list
for i in range(len(n)):
if n[i] == a:
return(l[i],a) #element,frequency
return #if something goes wrong

Perhaps the most_common() method

I obtained the best results with groupby from itertools module with this function using Python 3.5.2:
from itertools import groupby
a = [1, 2, 45, 55, 5, 4, 4, 4, 4, 4, 4, 5456, 56, 6, 7, 67]
def occurrence():
occurrence, num_times = 0, 0
for key, values in groupby(a, lambda x : x):
val = len(list(values))
if val >= occurrence:
occurrence, num_times = key, val
return occurrence, num_times
occurrence, num_times = occurrence()
print("%d occurred %d times which is the highest number of times" % (occurrence, num_times))
Output:
4 occurred 6 times which is the highest number of times
Test with timeit from timeit module.
I used this script for my test with number= 20000:
from itertools import groupby
def occurrence():
a = [1, 2, 45, 55, 5, 4, 4, 4, 4, 4, 4, 5456, 56, 6, 7, 67]
occurrence, num_times = 0, 0
for key, values in groupby(a, lambda x : x):
val = len(list(values))
if val >= occurrence:
occurrence, num_times = key, val
return occurrence, num_times
if __name__ == '__main__':
from timeit import timeit
print(timeit("occurrence()", setup = "from __main__ import occurrence", number = 20000))
Output (The best one):
0.1893607140000313

I want to throw in another solution that looks nice and is fast for short lists.
def mc(seq=L):
"max/count"
max_element = max(seq, key=seq.count)
return (max_element, seq.count(max_element))
You can benchmark this with the code provided by Ned Deily which will give you these results for the smallest test case:
3.5.2 (default, Nov 7 2016, 11:31:36)
[GCC 6.2.1 20160830]
dict iteritems (4, 6) 0.2069783889998289
dict items (4, 6) 0.20462976200065896
defaultdict iteritems (4, 6) 0.2095775119996688
sort groupby generator expression (4, 6) 0.4473949929997616
sort groupby list comprehension (4, 6) 0.4367636879997008
counter (4, 6) 0.3618192010007988
max/count (4, 6) 0.20328268999946886
But beware, it is inefficient and thus gets really slow for large lists!

Simple and best code:
def max_occ(lst,x):
count=0
for i in lst:
if (i==x):
count=count+1
return count
lst=[1, 2, 45, 55, 5, 4, 4, 4, 4, 4, 4, 5456, 56, 6, 7, 67]
x=max(lst,key=lst.count)
print(x,"occurs ",max_occ(lst,x),"times")
Output: 4 occurs 6 times

My (simply) code (three months studying Python):
def more_frequent_item(lst):
new_lst = []
times = 0
for item in lst:
count_num = lst.count(item)
new_lst.append(count_num)
times = max(new_lst)
key = max(lst, key=lst.count)
print("In the list: ")
print(lst)
print("The most frequent item is " + str(key) + ". Appears " + str(times) + " times in this list.")
more_frequent_item([1, 2, 45, 55, 5, 4, 4, 4, 4, 4, 4, 5456, 56, 6, 7, 67])
The output will be:
In the list:
[1, 2, 45, 55, 5, 4, 4, 4, 4, 4, 4, 5456, 56, 6, 7, 67]
The most frequent item is 4. Appears 6 times in this list.

if you are using numpy in your solution for faster computation use this:
import numpy as np
x = np.array([2,5,77,77,77,77,77,77,77,9,0,3,3,3,3,3])
y = np.bincount(x,minlength = max(x))
y = np.argmax(y)
print(y) #outputs 77

Following is the solution which I came up with if there are multiple characters in the string all having the highest frequency.
mystr = input("enter string: ")
#define dictionary to store characters and their frequencies
mydict = {}
#get the unique characters
unique_chars = sorted(set(mystr),key = mystr.index)
#store the characters and their respective frequencies in the dictionary
for c in unique_chars:
ctr = 0
for d in mystr:
if d != " " and d == c:
ctr = ctr + 1
mydict[c] = ctr
print(mydict)
#store the maximum frequency
max_freq = max(mydict.values())
print("the highest frequency of occurence: ",max_freq)
#print all characters with highest frequency
print("the characters are:")
for k,v in mydict.items():
if v == max_freq:
print(k)
Input: "hello people"
Output:
{'o': 2, 'p': 2, 'h': 1, ' ': 0, 'e': 3, 'l': 3}
the highest frequency of occurence: 3
the characters are:
e
l

may something like this:
testList = [1, 2, 3, 4, 2, 2, 1, 4, 4]
print(max(set(testList), key = testList.count))

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.

How to optimize this Python code? - python

import random import collections def maxvote(nlabels): cnt = collections.defaultdict(int) for i in nlabels: cnt[i] += 1 maxv = max(cnt.itervalues()) return random.choice([k for k,v in cnt.iteritems() if v == maxv]) print maxvote([1,3,4,5,5,5,3,3,11])

It appears to run in O(n) time. However there may be a bottleneck in checking if nLabel in count since this operation could also potentially run O(n) time as well, making the total efficiency O(n^2). Using a dictionary instead of a list in this case is the only major efficiency boost I can spot.

Related

How to create a dict from list where values are elements of the list and keys are function of those elements in python?

Loops and Dictionary

Building a custom Counter function without using built-ins

duplicates function, adding

Find the item with maximum occurrences in a list [duplicate]

Categories

Resources