What is the inbuilt .count in python? - python

I've been solving problems in checkio.com and one of the questions was: "Write a function to find the letter which occurs the maximum number of times in a given string"
The top solution was:
import string

def checkio(text):
    """
    We iterate through latin alphabet and count each letter in the text.
    Then 'max' selects the most frequent letter.
    For the case when we have several equal letter,
    'max' selects the first from they.
    """
    text = text.lower()
    return max(string.ascii_lowercase, key=text.count)
I didn't understand what text.count is when it is used as the key in the max function.
Edit: Sorry for not being more specific. I know what the program does, and I know what str.count() does. I want to know what text.count is. If .count is a method, shouldn't it be followed by parentheses?

text.count without parentheses is the bound method itself, not a call to it. Writing text.count(x) calls the method immediately, while text.count on its own is a callable object that max() can call later. With key=text.count, max() calls text.count(letter) for each letter in string.ascii_lowercase and returns the letter with the highest count.
When the following code is run, the result is e, which is the most frequent letter in the string.
import string

def checkio(text):
    """
    We iterate through latin alphabet and count each letter in the text.
    Then 'max' selects the most frequent letter.
    For the case when we have several equal letter,
    'max' selects the first from they.
    """
    text = text.lower()
    return max(string.ascii_lowercase, key=text.count)

print(checkio('hello my name is heinst'))
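A minimal sketch of why no parentheses follow text.count in the key argument. The names counter and best are made up for illustration; the point is only that the method can be stored and called later, which is what max() does with its key.

```python
text = "hello my name is heinst"

# No parentheses: this does not call the method, it only refers to it.
counter = text.count
print(callable(counter))

# Calling the stored method is the same as calling text.count(...) directly.
print(counter("e") == text.count("e"))

# max() calls key(letter) for every candidate letter and keeps the
# letter whose key value is largest.
best = max("abcdefghijklmnopqrstuvwxyz", key=counter)
print(best)
```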

The key function passed to max() is called once for each element, and max() compares the returned values rather than the elements themselves. In this case that isn't all that efficient.
Essentially, the line max(string.ascii_lowercase, key=text.count) can be translated to:
max_character, max_count = None, -1
for character in string.ascii_lowercase:
    count = text.count(character)
    if count > max_count:
        max_character, max_count = character, count
return max_character
where str.count() loops through the whole of text counting how often character occurs.
You should really use a multiset / bag here instead; in Python that's provided by the collections.Counter() type:
from collections import Counter
max_character = Counter(text.lower()).most_common(1)[0][0]
Counter() takes O(N) time to count the characters in a string of length N, and finding the highest count takes another O(K), where K is the number of unique characters. Since K is at most N, the whole process takes O(N) time.
The max() approach takes O(MN) time, where M is the length of string.ascii_lowercase.
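One caveat: Counter(text.lower()) counts every character, including spaces and punctuation, so most_common(1) may return something that is not a letter. The sketch below assumes that only ASCII letters should be considered and that ties go to the alphabetically first letter, as in the max() version. The function name most_frequent_letter is hypothetical.

```python
import string
from collections import Counter


def most_frequent_letter(text):
    """Return the most frequent ASCII letter in text, or None if there is none."""
    # Count only ASCII letters, in a single pass over the text.
    counts = Counter(ch for ch in text.lower() if ch in string.ascii_lowercase)
    if not counts:
        return None
    # Highest count wins; among equal counts the alphabetically first letter wins.
    return min(counts, key=lambda ch: (-counts[ch], ch))


print(most_frequent_letter("hello my name is heinst"))
```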

Use the Counter class from the collections module.
>>> import collections
>>> word = "supercalafragalistic"
>>> c = collections.Counter(word)
>>> c.most_common()
[('a', 4), ('c', 2), ('i', 2), ('l', 2), ('s', 2), ('r', 2), ('e', 1), ('g', 1), ('f', 1), ('p', 1), ('u', 1), ('t', 1)]
>>> c.most_common()[0]
('a', 4)

Related

Character count in Python

The task: read a word from the user, count each character in it, and display the counts in sorted order (counts descending, characters ascending within the same count).
For example, if the user enters "management", the output should be:
a 2
e 2
m 2
n 2
g 1
t 1
This is the code I wrote for the task:
string=input().strip()
set1=set(string)
lis=[]
for i in set1:
    lis.append(i)
lis.sort()
while len(lis)>0:
    maxi=0
    for i in lis:
        if string.count(i)>maxi:
            maxi=string.count(i)
    for j in lis:
        if string.count(j)==maxi:
            print(j,maxi)
            lis.remove(j)
This code gives me the following output for the string "management":
a 2
m 2
e 2
n 2
g 1
t 1
m and e are not in sorted order.
What is wrong with my code?
The issue with your code is that you remove elements from the list while you're still iterating over it. When you remove "a", "e" shifts into its position, and the loop moves on to the next index, which now holds "g". Thus "e" is skipped until the next pass of the while loop.
Separate the printing from the removal, and don't remove elements from a list you're currently iterating over. Instead, build a new list from the elements you want to keep.
string=input().strip()
set1=set(string)
lis=[]
for i in set1:
    lis.append(i)
lis.sort()
while len(lis)>0:
    maxi=0
    for i in lis:
        if string.count(i)>maxi:
            maxi=string.count(i)
    for j in lis:
        if string.count(j)==maxi:
            print(j,maxi)
    dupelis = lis
    lis = []
    for k in dupelis:
        if string.count(k)!=maxi:
            lis.append(k)
management
a 2
e 2
m 2
n 2
g 1
t 1
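The skipping caused by removing items during iteration can be reproduced in isolation. This is a small sketch using the sorted letters of "management"; the variable names visited and visited_copy are made up for illustration.

```python
word = "management"

# Removing from the list being iterated: the loop's internal index keeps
# advancing while the remaining items shift left, so some items are skipped.
letters = sorted(set(word))          # ['a', 'e', 'g', 'm', 'n', 't']
visited = []
for ch in letters:
    visited.append(ch)
    if word.count(ch) == 2:
        letters.remove(ch)

# Iterating over a copy (letters[:]) visits every item exactly once,
# even though the original list is modified inside the loop.
letters = sorted(set(word))
visited_copy = []
for ch in letters[:]:
    visited_copy.append(ch)
    if word.count(ch) == 2:
        letters.remove(ch)

print(visited)
print(visited_copy)
print(letters)
```

In the first loop "e" and "n" are never visited, which is why they show up out of order in the original program.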
The problem with your code is the assignment of the variable maxi and the two for loops. "e" won't come second because you are assigning maxi as 2, and string.count(i) will be less than maxi.
for i in lis:
    if string.count(i)>maxi:
        maxi=string.count(i)
for j in lis:
    if string.count(j)==maxi:
        print(j,maxi)
There are several ways of achieving what you are looking for. You can try the solutions as others have explained.
You can use a simple Counter for that:
>>> from collections import Counter
>>> Counter("management")
Counter({'a': 2, 'e': 2, 'm': 2, 'n': 2, 'g': 1, 't': 1})
I'm not really sure what you are trying to achieve by adding a while loop and then two nested for loops inside it. But the same thing can be achieved by a single for loop.
for i in lis:
    print(i, string.count(i))
With this the output will be:
a 2
e 2
g 1
m 2
n 2
t 1
As answered before, you can use a Counter to get the counts of characters, no need to make a set or list.
For sorting, you'd be well off using the inbuilt sorted function which accepts a function in the key parameter. Read more about sorting and lambda functions.
>>> from collections import Counter
>>> c = Counter('management')
>>> sorted(c.items())
[('a', 2), ('e', 2), ('g', 1), ('m', 2), ('n', 2), ('t', 1)]
>>> alpha_sorted = sorted(c.items())
>>> sorted(alpha_sorted, key=lambda x: x[1])
[('g', 1), ('t', 1), ('a', 2), ('e', 2), ('m', 2), ('n', 2)]
>>> sorted(alpha_sorted, key=lambda x: x[1], reverse=True) # Reverse ensures you get descending sort
[('a', 2), ('e', 2), ('m', 2), ('n', 2), ('g', 1), ('t', 1)]
The easiest way to count the characters is to use Counter, as suggested by some previous answers. After that, the trick is to come up with a measure that takes both the count and the character into account to achieve the sorting. I have the following:
from collections import Counter
c = Counter('management')
sc = sorted(c.items(),
            key=lambda x: -1000 * x[1] + ord(x[0]))
for char, count in sc:
    print(char, count)
c.items() gives a list of tuples (character, count). We can use sorted() to sort them.
The parameter key is the key. sorted() puts items with lower keys (i.e. keys with smaller values) first, so I have to make a big count have a small value.
I basically give a lot of negative weight (-1000) to the count (x[1]), then augment that with the ascii value of character (ord(x[0])). The result is a sorting order that takes into account the count first, the character second.
An underlying assumption is that ord(x[0]) never exceeds 1000, which should be true of English characters.
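The weighting trick relies on ord() staying below 1000. A sketch of an alternative that avoids that assumption is to sort on a tuple of (negated count, character), since Python compares tuples element by element. The variable names here are illustrative only.

```python
from collections import Counter

counts = Counter("management")

# Sort by descending count first (negated so larger counts come first),
# then by the character itself in ascending order.
ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))

for char, count in ordered:
    print(char, count)
```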

Run Length encoding of symbols

I am trying to write run-length encoding in Python. If a message consists of a long sequence of symbols, I am meant to encode it as each symbol followed by the number of times it occurs in a row. This is my code:
alphabets = ['a','b','c','d','e','f','g','h','i','j','k',
             'l','m','n','o','p','q','r','s','t','u','v','w','x','y','z']
char_count = 0
translate = ''
words = input('Enter your word: ')
for char in words:
    if char in alphabets:
        char_count += 1
        translate += char + str(char_count)
print(translate)
When I run my program this is what I get.
Enter your word: abbbbaaabbaaa
a1b2b3b4b5a6a7a8b9b10a11a12a13
The output is actually meant to be.
a1b4a3b2a3
Is there a way to fix this?
You can simply use regular expressions to solve the problem:
import re
translate = re.sub(r"((.)\2*)", lambda x: x.group(2) + str(len(x.group(1))), words)
This regex finds every run of identical consecutive symbols in the words string and replaces each run with the symbol followed by the length of the run.
One possible way is to use itertools.groupby:
from itertools import groupby
''.join([f'{letter}{len(list(grouper))}' for letter, grouper in groupby(words)])
Explanation
itertools.groupby splits the string into chunks of same letters, converts each chunk into a pair (letter, grouper) and returns an object generating these pairs:
>>> groupby('abbbbaaabbaaa')
<itertools.groupby at 0x6fffeafa098>
>>> for chunk in groupby('abbbbaaabbaaa'):
...     print(chunk)
...
('a', <itertools._grouper object at 0x6fffeaf2cf8>)
('b', <itertools._grouper object at 0x6fffeae9908>)
('a', <itertools._grouper object at 0x6fffeae9898>)
('b', <itertools._grouper object at 0x6fffeaf2320>)
('a', <itertools._grouper object at 0x6fffeae9898>)
Each itertools._grouper object is itself an iterator that yields all the letters in the corresponding chunk. By converting it to a list, we can take its length and append it to the result.
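For comparison, the same encoding can be written as an explicit loop that tracks the current run, which is close in spirit to the code in the question. This is only a sketch: the function name run_length_encode is hypothetical, and unlike the original code it encodes every character rather than only lowercase letters.

```python
def run_length_encode(text):
    """Encode runs of repeated characters, e.g. 'aab' -> 'a2b1'."""
    if not text:
        return ""
    pieces = []
    current, run = text[0], 1
    for ch in text[1:]:
        if ch == current:
            run += 1                          # still inside the same run
        else:
            pieces.append(f"{current}{run}")  # run ended: emit it
            current, run = ch, 1              # start a new run
    pieces.append(f"{current}{run}")          # emit the final run
    return "".join(pieces)


print(run_length_encode("abbbbaaabbaaa"))
```

The key difference from the code in the question is that the counter is reset to 1 whenever the character changes, and a symbol is emitted once per run rather than once per character.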

Manipulating counter information - Python 2.7

I'm fairly new to Python and I have this program that I was tinkering with. It's supposed to get a string from input and display which character is the most frequent.
stringToData = raw_input("Please enter your string: ")
# import the collections module
import collections
# gets the data needed from the collection
letter, count = collections.Counter(stringToData).most_common(1)[0]
# prints the results
print "The most frequent character is %s, which occurred %d times." % (
    letter, count)
However, if the string has one of each character, it displays only one letter and says it's the most frequent character. I thought about changing the number in the parentheses in most_common(number), but I didn't want it to display the counts of the other letters every time.
Thank you to all that help!
As I explained in the comment:
You can leave off the parameter to most_common to get a list of all characters, ordered from most common to least common. Then just loop through that result and collect the characters as long as the counter value is still the same. That way you get all characters that are most common.
Counter.most_common(n) returns the n most common elements from the counter. Or in case where n is not specified, it will return all elements from the counter, ordered by the count.
>>> collections.Counter('abcdab').most_common()
[('a', 2), ('b', 2), ('c', 1), ('d', 1)]
You can use this behavior to simply loop through all elements, ordered by their count. As long as the count is the same as that of the first element in the output, you know that the element occurred just as often in the string.
>>> c = collections.Counter('abcdefgabc')
>>> maxCount = c.most_common(1)[0][1]
>>> elements = []
>>> for element, count in c.most_common():
...     if count != maxCount:
...         break
...     elements.append(element)
...
>>> elements
['a', 'c', 'b']
>>> [e for e, n in c.most_common() if n == maxCount]
['a', 'c', 'b']

Frequently occurring characters in a string in alphabetical order in Python without using control flow

The question is to find the most frequently occurring characters in a string. The characters must be output in descending order of frequency. In case of a tie, i.e. the same number of occurrences, the characters must be output in alphabetical order.
for example:
s="aaccbba"
the output should be
(('a',3),('b',2),('c',2))
and not
(('a',3),('c',2),('b',2))
Note: you shouldn't use control flow statements.
The Python version I am using is 2.7.5.
I even tried using counters, but it was of no help.
Use collections.Counter:
>>> from collections import Counter
>>> Counter("aaccbba").most_common()
[('a', 3), ('c', 2), ('b', 2)]
The output can be sorted:
>>> sorted(Counter("aaccbba").most_common(), key=lambda v: (-v[1], v[0]))
[('a', 3), ('b', 2), ('c', 2)]
But really, there is no real difference between listing b first or c first; they are otherwise equal.
Sorting the output does double work; most_common() already sorted its items for you, and the above just sorts the lot again with slightly different criteria.

PySchool- List (Topic 6-22)

I am a beginner in Python and I am trying to solve some questions about lists. I got stuck on one problem and I am not able to solve it:
Write a function countLetters(word) that takes in a word as argument
and returns a list that counts the number of times each letter
appears. The letters must be sorted in alphabetical order.
Ex:
>>> countLetters('google')
[('e', 1), ('g', 2), ('l', 1), ('o', 2)]
I am not able to count the occurrences of every character. For sorting I am using sorted(list) and I am also using dictionary(items functions) for this format of output(tuples of list). But I am not able to link all these things.
Use sets!
>>> m = "google"
>>> u = set(m)
>>> sorted([(l, m.count(l)) for l in u])
[('e', 1), ('g', 2), ('l', 1), ('o', 2)]
A hint: Note that you can loop through a string in the same way as a list or other iterable object in python:
def countLetters(word):
    for letter in word:
        print(letter)

countLetters("ABC")
The output will be:
A
B
C
So instead of printing, use the loop to look at what letter you've got (in your letter variable) and count it somehow.
Finally, made it!
import collections

def countch(strng):
    d = collections.defaultdict(int)
    for letter in strng:
        d[letter] += 1
    print(sorted(d.items()))
This is my solution. Now I would love to see your solutions to this problem.
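Since the exercise asks for a function that returns the list rather than printing it, here is one possible sketch using collections.Counter. It is written for Python 3 and is just one way to do it, not the PySchool reference solution.

```python
from collections import Counter


def countLetters(word):
    """Return a list of (letter, count) tuples sorted alphabetically."""
    # Counter tallies each character; sorting the items orders the tuples
    # by letter because tuples compare on their first element first.
    return sorted(Counter(word).items())


print(countLetters("google"))
```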
