P.S.: Thank you, everybody, especially Matthias Fripp. I just reviewed the question. You are right, I made a mistake: the string is the value, not the key.
num=[1,2,3,4,5,6]
pow=[1,4,9,16,25,36]
s= ":subtraction"
dic={1:1 ,0:s , 2:4,2:s, 3:9,6:s, 4:16,12:s.......}
There is an easy way to convert two lists to a dictionary:
newdic=dict(zip(list1,list2))
but for this problem I have no clue, even with a comprehension:
print({num[i]:pow[i] for i in range(len(num))})
As others have said, a dict cannot contain duplicate keys. You can make each key unique with a little bit of tweaking. I used OrderedDict to keep the order of the inserted keys:
from pprint import pprint
from collections import OrderedDict
num=[1,2,3,4,5,6]
pow=[1,4,9,16,25,36]
pprint(OrderedDict(sum([[[a, b], ['subtraction ({}-{}):'.format(a, b), a-b]] for a, b in zip(num, pow)], [])))
Prints:
OrderedDict([(1, 1),
('subtraction (1-1):', 0),
(2, 4),
('subtraction (2-4):', -2),
(3, 9),
('subtraction (3-9):', -6),
(4, 16),
('subtraction (4-16):', -12),
(5, 25),
('subtraction (5-25):', -20),
(6, 36),
('subtraction (6-36):', -30)])
In principle, this would do what you want:
nums = [(n, p) for (n, p) in zip(num, pow)]
diffs = [('subtraction', p-n) for (n, p) in zip(num, pow)]
items = nums + diffs
dic = dict(items)
However, a dictionary cannot have multiple items with the same key, so each of your "subtraction" items will be replaced by the next one added to the dictionary, and you'll only get the last one. So you might prefer to work with the items list directly.
If you need the items list sorted as you've shown, that will take a little more work. Maybe something like this:
items = []
for n, p in zip(num, pow):
    items.append((n, p))
    items.append(('subtraction', p-n))
# the next line will drop most 'subtraction' entries, but on
# Python 3.7+, it will at least preserve the order (not possible
# with earlier versions of Python)
dic = dict(items)
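To make the key collision concrete, here is a minimal sketch. The names squares, collapsed and combined are my own (the question calls the second list pow, which shadows a built-in), and the nested-dict layout at the end is just one hypothetical way to keep every difference.

```python
num = [1, 2, 3, 4, 5, 6]
squares = [1, 4, 9, 16, 25, 36]  # "pow" in the question; renamed here

# Build the interleaved item list exactly as described above.
items = []
for n, p in zip(num, squares):
    items.append((n, p))
    items.append(('subtraction', p - n))

# Converting to a dict keeps one entry per key, so only the last
# 'subtraction' value (36 - 6) survives.
collapsed = dict(items)
print(len(items))                # 12
print(len(collapsed))            # 7
print(collapsed['subtraction'])  # 30

# Hypothetical alternative: store both values under each unique number.
combined = {n: {'pow': p, 'subtraction': p - n} for n, p in zip(num, squares)}
print(combined[3])  # {'pow': 9, 'subtraction': 6}
```

If the order and every pair matter, the items list itself is the structure to keep.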
Python's collections.Counter.most_common(n) method returns the top n elements with their counts. However, if the counts for two elements are the same, how can I return the result sorted in alphabetical order?
For example, for a string like BBBAAACCD and n = 2 (the "2-most common" counts), I want the result to be:
[('A', 3), ('B', 3), ('C', 2)]
and NOT:
[('B', 3), ('A', 3), ('C', 2)]
Notice that although A and B have the same frequency, A comes before B in the resultant list since it comes before B in alphabetical order.
How can I achieve that?
Although this question is already a bit old, I'd like to suggest a very simple solution, which just involves sorting the input of Counter() before creating the Counter object itself. If you then call most_common(n), you will get the top n entries with ties in alphabetical order.
from collections import Counter
char_counter = Counter(sorted('ccccbbbbdaef'))
for char in char_counter.most_common(3):
    print(*char)
resulting in the output:
b 4
c 4
a 1
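Why this works, as far as I can tell: since Python 3.7, most_common() returns elements with equal counts in the order they were first encountered, so sorting the input first makes that order alphabetical. A small sketch comparing both cases (the variable names are mine, and this assumes Python 3.7 or later):

```python
from collections import Counter

text = 'ccccbbbbdaef'

# Without sorting, ties come back in first-encountered order: c before b.
unsorted_top = Counter(text).most_common(2)

# With sorted input, 'b' is encountered before 'c', so it wins the tie.
sorted_top = Counter(sorted(text)).most_common(2)

print(unsorted_top)  # [('c', 4), ('b', 4)]
print(sorted_top)    # [('b', 4), ('c', 4)]
```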
There are two issues here:
Ties must be included: "top n" means the n largest distinct counts, and every element with one of those counts belongs in the result.
Elements with equal counts must be ordered alphabetically.
None of the solutions so far addresses the first issue. You can use a heap queue with the itertools unique_everseen recipe (also available in 3rd party libraries, such as toolz.unique) to calculate the nth largest distinct count.
Then use sorted with a custom key.
from collections import Counter
from heapq import nlargest
from toolz import unique
x = 'BBBAAACCD'
c = Counter(x)
n = 2
nth_largest = nlargest(n, unique(c.values()))[-1]
def sort_key(x):
    return -x[1], x[0]
gen = ((k, v) for k, v in c.items() if v >= nth_largest)
res = sorted(gen, key=sort_key)
[('A', 3), ('B', 3), ('C', 2)]
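If you would rather not depend on toolz, the same idea can be sketched with the standard library only, using a set to get the distinct counts. The function name top_n_with_ties is hypothetical, and the sketch assumes a non-empty input.

```python
from collections import Counter
from heapq import nlargest


def top_n_with_ties(text, n):
    """Return every (element, count) whose count is among the n largest
    distinct counts, ordered by count descending, then alphabetically."""
    counts = Counter(text)
    distinct_counts = set(counts.values())
    # Smallest of the n largest distinct counts acts as the cut-off.
    threshold = nlargest(n, distinct_counts)[-1]
    kept = [(k, v) for k, v in counts.items() if v >= threshold]
    return sorted(kept, key=lambda kv: (-kv[1], kv[0]))


print(top_n_with_ties('BBBAAACCD', 2))  # [('A', 3), ('B', 3), ('C', 2)]
print(top_n_with_ties('BBBAAACCD', 1))  # [('A', 3), ('B', 3)]
```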
I would first sort your output array in alphabetical order and then sort again by number of occurrences, which keeps the alphabetical order within ties because Python's sort is stable:
from collections import Counter
alphabetic_sorted = sorted(Counter('BBBAAACCD').most_common(), key=lambda tup: tup[0])
final_sorted = sorted(alphabetic_sorted, key=lambda tup: tup[1], reverse=True)
print(final_sorted[:3])
Output:
[('A', 3), ('B', 3), ('C', 2)]
I would go for:
sorted(Counter('AAABBBCCD').most_common(), key=lambda t: (-t[1], t[0]))
This sorts by count descending (the counts already arrive in that order from most_common(), which should help performance) and then by name ascending within each group of equal counts.
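Wrapped up as a reusable function, the single-key approach might look like the sketch below. The helper name most_common_sorted and its optional n parameter are my own additions, not part of Counter.

```python
from collections import Counter


def most_common_sorted(text, n=None):
    """Counts descending, ties broken alphabetically (hypothetical helper)."""
    ranked = sorted(Counter(text).items(), key=lambda t: (-t[1], t[0]))
    return ranked if n is None else ranked[:n]


print(most_common_sorted('BBBAAACCD', 2))
# [('A', 3), ('B', 3)]
print(most_common_sorted('BBBAAACCD'))
# [('A', 3), ('B', 3), ('C', 2), ('D', 1)]
```

Negating the count only works because counts are numbers; for a non-numeric first criterion you would need two stable sorts instead.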
This is one of the problems I got in an interview exam and failed to solve. I came home, slept for a while, and the solution came to mind.
from collections import Counter
def bags(list):
    cnt = Counter(list)
    print(cnt)
    order = sorted(cnt.most_common(2), key=lambda i:( i[1],i[0]), reverse=True)
    print(order)
    return order[0][0]
print(bags(['a','b','c','a','b']))
s = "BBBAAACCD"
p = [(i,s.count(i)) for i in sorted(set(s))]
The above works if you are okay with not using Counter.
from collections import Counter
s = 'qqweertyuiopasdfghjklzxcvbnm'
s_list = list(s)
elements = Counter(s_list).most_common()
print(elements)
alphabet_sort = sorted(elements, key=lambda x: x[0])
print(alphabet_sort)
num_sort = sorted(alphabet_sort, key=lambda x: x[1], reverse=True)
print(num_sort)
If you need a slice:
print(num_sort[:3])
from collections import Counter
print(sorted(Counter('AAABBBCCD').most_common(3)))
This question seems to be a duplicate
How to sort Counter by value? - python
I have a list of (str,int) pairs
list_word = [('AND', 1), ('BECAUSE', 1), ('OF', 1), ('AFRIAD', 1), ('NEVER', 1), ('CATS', 2), ('ARE', 2), ('FRIENDS', 1), ('DOGS', 2)]
This basically says how many times each word showed up in a text.
What I want to get is the set of words with maximum occurrence along with maximum occurrence number. So, in the above example, I want to get
(set(['CATS', 'DOGS','ARE']), 2)
The solution I can think of is looping through the list. But is there any elegant way of doing this?
Two linear scans, first to find the maximal element:
from operator import itemgetter
maxcount = max(map(itemgetter(1), mylist))
then a second to pull out the values you care about:
maxset = {word for word, count in mylist if count == maxcount}, maxcount
If you needed to get the sets for more than just the maximal count, you can use collections.defaultdict to accumulate by count in a single pass:
from collections import defaultdict
sets_by_count = defaultdict(set)
for word, count in mylist:
    sets_by_count[count].add(word)
Which can then be followed by allcounts = sorted(sets_by_count.items(), key=itemgetter(0), reverse=True) to get a list of count, set pairs, from highest to lowest count (with minimal sorting work, since it's sorting only a number of items equal to the unique counts, not all words).
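Putting the single-pass accumulation and the final sort together, a runnable sketch on the question's data could look like this (I use the question's list_word name for the input; the rest follows the snippets above):

```python
from collections import defaultdict
from operator import itemgetter

list_word = [('AND', 1), ('BECAUSE', 1), ('OF', 1), ('AFRIAD', 1),
             ('NEVER', 1), ('CATS', 2), ('ARE', 2), ('FRIENDS', 1),
             ('DOGS', 2)]

# Single pass: group words by their count.
sets_by_count = defaultdict(set)
for word, count in list_word:
    sets_by_count[count].add(word)

# Sort only the distinct counts, highest first.
allcounts = sorted(sets_by_count.items(), key=itemgetter(0), reverse=True)

top_count, top_words = allcounts[0]
print(top_count)          # 2
print(sorted(top_words))  # ['ARE', 'CATS', 'DOGS']
```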
Convert the list to a dict with the count as key and the set of words as value. Then find the max key and its corresponding value.
from collections import defaultdict
my_list = [('AND', 1), ('BECAUSE', 1), ('OF', 1), ('AFRIAD', 1), ('NEVER', 1), ('CATS', 2), ('ARE', 2), ('FRIENDS', 1), ('DOGS', 2)]
my_dict = defaultdict(set)
for k, v in my_list:
    my_dict[v].add(k)
max_value = max(my_dict.keys())
print((my_dict[max_value], max_value))
# prints: ({'CATS', 'ARE', 'DOGS'}, 2) on Python 3; set element order may vary
While the more Pythonic solutions are certainly easier on the eye, requiring two scans or building data structures you don't really want is, unfortunately, significantly slower.
The following fairly boring solution is about 55% faster than the dict solution and about 70% faster than the comprehension-based solutions on the provided example data (with my implementations, machine, benchmarking, etc.).
This is almost certainly down to the single scan here rather than two.
word_occs = [
    ('AND', 1), ('BECAUSE', 1), ('OF', 1), ('AFRIAD', 1), ('NEVER', 1),
    ('CATS', 2), ('ARE', 2), ('FRIENDS', 1), ('DOGS', 2)
]
def linear_scan(word_occs):
    max_val = 0
    max_set = None
    for word, occ in word_occs:
        if occ == max_val:
            max_set.add(word)
        elif occ > max_val:
            max_val, max_set = occ, {word}
    return max_set, max_val
To be fair, they are all blazing fast and in your case readability might be more important.
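One caveat with a single scan that starts from a maximum of 0: it assumes every count is positive. A slightly more defensive variant is sketched below; the name linear_scan_safe and the choice to return (set(), None) for empty input are my own assumptions, not part of the benchmarked code.

```python
def linear_scan_safe(word_occs):
    """Single pass over (word, count) pairs; tolerates empty input and
    zero or negative counts."""
    max_val = None
    max_set = set()
    for word, occ in word_occs:
        if max_val is None or occ > max_val:
            # New maximum: start a fresh set.
            max_val, max_set = occ, {word}
        elif occ == max_val:
            max_set.add(word)
    return max_set, max_val


pairs = [('AND', 1), ('CATS', 2), ('ARE', 2), ('FRIENDS', 1), ('DOGS', 2)]
words, count = linear_scan_safe(pairs)
print(sorted(words), count)  # ['ARE', 'CATS', 'DOGS'] 2
```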
How can I return the top n most frequently occurring chars and their respective counts in Python? E.g. 'aaaaaabbbbcccc', 2 should return [('a', 6), ('b', 4)].
I tried this
def top_chars(input, n):
    list1=list(input)
    list3=[]
    list2=[]
    list4=[]
    set1=set(list1)
    list2=list(set1)
    def count(item):
        count=0
        for x in input:
            if x in input:
                count+=item.count(x)
        list3.append(count)
        return count
    list2.sort(key=count)
    list3.sort()
    list4=list(zip(list2,list3))
    list4.reverse()
    list4.sort(key=lambda list4: ((list4[1]),(list4[0])), reverse=True)
    return list4[0:n]
    pass
but it doesn't work for the input ("aabc",2)
The output it should give is
[('a', 2), ('b', 1)]
but the output I get is
[('a', 2), ('c', 1)]
Use collections.Counter(); it has a most_common() method that does just that:
>>> from collections import Counter
>>> counts = Counter('aaaaaabbbbcccc')
>>> counts.most_common(2)
[('a', 6), ('c', 4)]
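Note that in both the input above and in aabc, b and c have the same count, so either is a valid top contender. Your code sorts by count and then by key, both in reverse, so c is sorted before b. The Counter output shown above comes from an older Python version, where the order of tied elements was arbitrary; since Python 3.7, most_common() returns tied elements in first-encountered order, which gives ('b', 4) here.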
If instead of sorting in reverse, you used the negative count as the sort key, you'd sort b before c again:
list4.sort(key=lambda v: (-v[1], v[0]))
Note that Counter.most_common() does not actually sort when you are asking for fewer items than there are keys in the counter; it uses a heapq-based algorithm instead to get only the top N items.
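Combining both points, here is a minimal sketch of top_chars that counts with Counter and picks the top n with heapq.nsmallest on the (negative count, character) key, so ties come out alphabetically. This is my own variant, not what most_common() does internally.

```python
from collections import Counter
import heapq


def top_chars(text, n):
    """Top n characters by count; ties are broken alphabetically."""
    counts = Counter(text)
    # Smallest (-count, char) keys correspond to the largest counts,
    # with alphabetical order among equal counts.
    return heapq.nsmallest(n, counts.items(), key=lambda kv: (-kv[1], kv[0]))


print(top_chars('aabc', 2))            # [('a', 2), ('b', 1)]
print(top_chars('aaaaaabbbbcccc', 2))  # [('a', 6), ('b', 4)]
```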
A little harder, but also works:
text = "abbbaaaa"
counts = {}  # renamed from dict to avoid shadowing the built-in
for lines in text:
    for char in lines:
        counts[char] = counts.get(char, 0) + 1
print(counts)
I have a list of lists:
[[12, 'tall', 'blue', 1],
[2, 'short', 'red', 9],
[4, 'tall', 'blue', 13]]
If I wanted to sort by one element, say the tall/short element, I could do it via s = sorted(s, key = itemgetter(1)).
If I wanted to sort by both tall/short and colour, I could do the sort twice, once for each element, but is there a quicker way?
A key can be a function that returns a tuple:
s = sorted(s, key = lambda x: (x[1], x[2]))
Or you can achieve the same using itemgetter (which is faster and avoids a Python function call):
import operator
s = sorted(s, key = operator.itemgetter(1, 2))
And notice that here you can use sort instead of using sorted and then reassigning:
s.sort(key = operator.itemgetter(1, 2))
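For completeness, here is a runnable sketch on the question's data. The second sort, which mixes an ascending string column with a descending numeric column by negating the number, is my own illustration and not something the question asked for.

```python
from operator import itemgetter

s = [[12, 'tall', 'blue', 1],
     [2, 'short', 'red', 9],
     [4, 'tall', 'blue', 13]]

# Sort by height (index 1), then colour (index 2); equal rows keep
# their original relative order because the sort is stable.
by_height_colour = sorted(s, key=itemgetter(1, 2))
print(by_height_colour)
# [[2, 'short', 'red', 9], [12, 'tall', 'blue', 1], [4, 'tall', 'blue', 13]]

# Height ascending, last column descending (negate the numeric field).
mixed = sorted(s, key=lambda row: (row[1], -row[3]))
print(mixed)
# [[2, 'short', 'red', 9], [4, 'tall', 'blue', 13], [12, 'tall', 'blue', 1]]
```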
I'm not sure if this is the most pythonic method ...
I had a list of tuples that needed sorting 1st by descending integer values and 2nd alphabetically. This required reversing the integer sort but not the alphabetical sort. Here was my solution: (on the fly in an exam btw, I was not even aware you could 'nest' sorted functions)
a = [('Al', 2),('Bill', 1),('Carol', 2), ('Abel', 3), ('Zeke', 2), ('Chris', 1)]
b = sorted(sorted(a, key = lambda x : x[0]), key = lambda x : x[1], reverse = True)
print(b)
[('Abel', 3), ('Al', 2), ('Carol', 2), ('Zeke', 2), ('Bill', 1), ('Chris', 1)]
Several years late to the party, but I wanted to both sort on 2 criteria and use reverse=True. In case someone else wants to know how, you can wrap your criteria (functions) in parentheses to form a tuple:
s = sorted(my_list, key=lambda i: ( criteria_1(i), criteria_2(i) ), reverse=True)
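Keep in mind that reverse=True flips every criterion in the tuple. A short sketch of the difference, using sample (name, score) data of my own choosing:

```python
a = [('Al', 2), ('Bill', 1), ('Carol', 2), ('Abel', 3), ('Zeke', 2), ('Chris', 1)]

# reverse=True reverses both criteria: score descending AND name descending.
both_reversed = sorted(a, key=lambda x: (x[1], x[0]), reverse=True)
print(both_reversed)
# [('Abel', 3), ('Zeke', 2), ('Carol', 2), ('Al', 2), ('Chris', 1), ('Bill', 1)]

# To reverse only the numeric criterion, negate it instead.
mixed = sorted(a, key=lambda x: (-x[1], x[0]))
print(mixed)
# [('Abel', 3), ('Al', 2), ('Carol', 2), ('Zeke', 2), ('Bill', 1), ('Chris', 1)]
```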
It appears you could use a list instead of a tuple.
This becomes more important I think when you are grabbing attributes instead of 'magic indexes' of a list/tuple.
In my case I wanted to sort by multiple attributes of a class, where the incoming keys were strings. I needed different sorting in different places, and I wanted a common default sort for the parent class that clients were interacting with. I only wanted to override the 'sorting keys' when I really needed to, but also in a way that let me store them as lists the class could share.
So first I defined a helper method
def attr_sort(self, attrs=['someAttributeString']):
    '''helper to sort by the attributes named by strings of attrs in order'''
    return lambda k: [ getattr(k, attr) for attr in attrs ]
then to use it
# would be defined elsewhere, but shown here for conciseness
self.SortListA = ['attrA', 'attrB']
self.SortListB = ['attrC', 'attrA']
records = .... #list of my objects to sort
records.sort(key=self.attr_sort(attrs=self.SortListA))
# perhaps later nearby or in another function
more_records = .... #another list
more_records.sort(key=self.attr_sort(attrs=self.SortListB))
This will use the generated lambda function to sort the list by object.attrA and then object.attrB, assuming object has an attribute corresponding to each string name provided. The second case would sort by object.attrC, then object.attrA.
This also lets you expose sorting choices so they can be shared by a consumer or a unit test, or lets callers tell you how they want sorting done for some operation in your API by giving you only a list, without coupling them to your back-end implementation.
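A self-contained sketch of the same idea follows. The Record class and its attrA/attrB/attrC fields are hypothetical stand-ins for the objects being sorted, and attr_sort is written as a plain function rather than a method. The last lines show operator.attrgetter, which builds an equivalent key from attribute names.

```python
from dataclasses import dataclass
from operator import attrgetter


@dataclass
class Record:  # hypothetical record type, for illustration only
    attrA: int
    attrB: str
    attrC: float


def attr_sort(attrs):
    """Build a sort key from a list of attribute names, in priority order."""
    return lambda obj: [getattr(obj, name) for name in attrs]


records = [Record(2, 'x', 0.5), Record(1, 'z', 0.1), Record(1, 'y', 0.9)]

records.sort(key=attr_sort(['attrA', 'attrB']))
print([(r.attrA, r.attrB) for r in records])
# [(1, 'y'), (1, 'z'), (2, 'x')]

# attrgetter with several names returns a tuple, which works the same way.
records.sort(key=attrgetter('attrC', 'attrA'))
print([(r.attrC, r.attrA) for r in records])
# [(0.1, 1), (0.5, 2), (0.9, 1)]
```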
Convert the list of lists into a list of tuples, then sort the tuples by multiple fields.
data=[[12, 'tall', 'blue', 1],[2, 'short', 'red', 9],[4, 'tall', 'blue', 13]]
data=[tuple(x) for x in data]
result = sorted(data, key = lambda x: (x[1], x[2]))
print(result)
output:
[(2, 'short', 'red', 9), (12, 'tall', 'blue', 1), (4, 'tall', 'blue', 13)]
Here's one way: you basically rewrite your sort function to take a list of comparison functions, each of which compares one of the attributes you want to test. On each comparison, you check whether the cmp function returns a non-zero value; if so, you break and return that value.
You call it by calling a lambda of a function of a list of lambdas.
Its advantage is that it does a single pass through the data, not a sort of a previous sort as other methods do. Another thing is that it sorts in place, whereas sorted makes a copy.
I used it to write a rank function that ranks a list of objects, where each object is in a group and has a score, but you can add any list of attributes.
Note the un-lambda-like, though hackish, use of a lambda to call a setter.
The rank part won't work for a list of lists, but the sort will.
#First, here's a pure list version
#Note: this is Python 2 code; cmp(), a comparison function passed to sort(), and the print statement are not available in Python 3
my_sortLambdaLst = [lambda x,y:cmp(x[0], y[0]), lambda x,y:cmp(x[1], y[1])]
def multi_attribute_sort(x,y):
    r = 0
    for l in my_sortLambdaLst:
        r = l(x,y)
        if r!=0: return r #keep looping till you see a difference
    return r
Lst = [(4, 2.0), (4, 0.01), (4, 0.9), (4, 0.999),(4, 0.2), (1, 2.0), (1, 0.01), (1, 0.9), (1, 0.999), (1, 0.2) ]
Lst.sort(lambda x,y:multi_attribute_sort(x,y)) #The Lambda of the Lambda
for rec in Lst: print str(rec)
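On Python 3 the same comparator-chain idea can be sketched with functools.cmp_to_key. The cmp helper below is a hand-written stand-in for the removed built-in, and the names comparators and multi_attribute_cmp are mine.

```python
from functools import cmp_to_key


def cmp(a, b):
    """Stand-in for Python 2's built-in cmp(): returns -1, 0 or 1."""
    return (a > b) - (a < b)


# One comparison function per attribute, in priority order.
comparators = [lambda x, y: cmp(x[0], y[0]),
               lambda x, y: cmp(x[1], y[1])]


def multi_attribute_cmp(x, y):
    for compare in comparators:
        r = compare(x, y)
        if r != 0:
            return r  # first attribute that differs decides
    return 0


lst = [(4, 2.0), (4, 0.01), (1, 0.9), (1, 0.2)]
lst.sort(key=cmp_to_key(multi_attribute_cmp))
print(lst)  # [(1, 0.2), (1, 0.9), (4, 0.01), (4, 2.0)]
```

For plain ascending order on every attribute, a tuple key is simpler; the comparator chain is mainly worth it when the comparisons are not expressible as keys.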
Here's a way to rank a list of objects
class probe:
    def __init__(self, group, score):
        self.group = group
        self.score = score
        self.rank =-1
    def set_rank(self, r):
        self.rank = r
    def __str__(self):
        return '\t'.join([str(self.group), str(self.score), str(self.rank)])
def RankLst(inLst, group_lambda= lambda x:x.group, sortLambdaLst = [lambda x,y:cmp(x.group, y.group), lambda x,y:cmp(x.score, y.score)], SetRank_Lambda = lambda x, rank:x.set_rank(rank)):
    #Inner function is the only way (I could think of) to pass the sortLambdaLst into a sort function
    def multi_attribute_sort(x,y):
        r = 0
        for l in sortLambdaLst:
            r = l(x,y)
            if r!=0: return r #keep looping till you see a difference
        return r
    inLst.sort(lambda x,y:multi_attribute_sort(x,y))
    #Now Rank your probes
    rank = 0
    last_group = group_lambda(inLst[0])
    for i in range(len(inLst)):
        rec = inLst[i]
        group = group_lambda(rec)
        if last_group == group:
            rank+=1
        else:
            rank=1
            last_group = group
        SetRank_Lambda(inLst[i], rank) #This is pure evil!! The lambda purists are gnashing their teeth
Lst = [probe(4, 2.0), probe(4, 0.01), probe(4, 0.9), probe(4, 0.999), probe(4, 0.2), probe(1, 2.0), probe(1, 0.01), probe(1, 0.9), probe(1, 0.999), probe(1, 0.2) ]
RankLst(Lst, group_lambda= lambda x:x.group, sortLambdaLst = [lambda x,y:cmp(x.group, y.group), lambda x,y:cmp(x.score, y.score)], SetRank_Lambda = lambda x, rank:x.set_rank(rank))
print '\t'.join(['group', 'score', 'rank'])
for r in Lst: print r
There is a < operator between lists, which compares them element by element from the left, e.g.:
[12, 'tall', 'blue', 1] < [4, 'tall', 'blue', 13]
will give
False
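A short sketch of what that comparison implies for sorting: lists compare lexicographically, so sorting a list of lists with no key orders by the first element, then the second, and so on. The variable names are mine.

```python
a = [12, 'tall', 'blue', 1]
b = [4, 'tall', 'blue', 13]

# The first elements differ (12 vs 4), so they alone decide the result.
print(a < b)  # False

# A shorter list that is a prefix of a longer one compares as smaller.
print([4, 'tall'] < [4, 'tall', 'blue'])  # True

# sorted() with no key therefore orders rows by all columns, left to right.
rows = sorted([a, b])
print(rows)  # [[4, 'tall', 'blue', 13], [12, 'tall', 'blue', 1]]
```

To sort by columns in a different priority order, you still need a key such as itemgetter(1, 2).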