PySchool- List (Topic 6-22) - python

I am a beginner in Python and I am trying to solve some questions about lists. I got stuck on one problem and I am not able to solve it:
Write a function countLetters(word) that takes in a word as argument
and returns a list that counts the number of times each letter
appears. The letters must be sorted in alphabetical order.
Ex:
>>> countLetters('google')
[('e', 1), ('g', 2), ('l', 1), ('o', 2)]
I am not able to count the occurrences of every character. For sorting I am using sorted(list), and I am using a dictionary's items() method to get the output format (a list of tuples), but I am not able to link all these things together.

Use sets!
m = "google"
u = set(m)
sorted([(l, m.count(l)) for l in u])
This returns:
[('e', 1), ('g', 2), ('l', 1), ('o', 2)]

A hint: Note that you can loop through a string in the same way as a list or other iterable object in python:
def countLetters(word):
    for letter in word:
        print(letter)
countLetters("ABC")
The output will be:
A
B
C
So instead of printing, use the loop to look at what letter you've got (in your letter variable) and count it somehow.

finally, made it!!!
import collections
def countch(strng):
    d = collections.defaultdict(int)
    for letter in strng:
        d[letter] += 1
    print(sorted(d.items()))
This is my solution. Now I can ask for your solutions to this problem. I would love to see your code.
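For comparison, here is a minimal sketch of the same idea that returns the list instead of printing it, as the exercise asks. It swaps defaultdict(int) for collections.Counter; that substitution is my own choice, not something the exercise requires.

```python
from collections import Counter

def countLetters(word):
    # Counter maps each character to the number of times it occurs.
    counts = Counter(word)
    # Tuples compare by their first element first, so sorting the
    # (letter, count) pairs puts the letters in alphabetical order.
    return sorted(counts.items())

print(countLetters('google'))  # [('e', 1), ('g', 2), ('l', 1), ('o', 2)]
```

Returning the list rather than printing it lets the caller compare the result against the expected value.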

Related

Character count in Python

The task: get a word from the user, then count each character in the word and display the counts in sorted order (counts descending, and characters ascending within the same count).
For example, if the user enters "management", the output should be:
a 2
e 2
m 2
n 2
g 1
t 1
This is the code I wrote for the task:
string=input().strip()
set1=set(string)
lis=[]
for i in set1:
    lis.append(i)
lis.sort()
while len(lis)>0:
    maxi=0
    for i in lis:
        if string.count(i)>maxi:
            maxi=string.count(i)
    for j in lis:
        if string.count(j)==maxi:
            print(j,maxi)
            lis.remove(j)
This code gives me the following output for the string "management":
a 2
m 2
e 2
n 2
g 1
t 1
m & e are not sorted.
What is wrong with my code?
The issue with your code is that you remove elements from the list while you are still iterating over it. When you remove "a", "e" shifts into its position at index 0, and the loop moves on to index 1, which now holds "g". "e" is therefore skipped until the next pass of the while loop, by which time "m" has already been printed.
Try separating your printing and your removal, and don't remove elements from a list you're currently iterating over - instead, try adding all other elements to a new list.
string=input().strip()
set1=set(string)
lis=[]
for i in set1:
    lis.append(i)
lis.sort()
while len(lis)>0:
    maxi=0
    for i in lis:
        if string.count(i)>maxi:
            maxi=string.count(i)
    for j in lis:
        if string.count(j)==maxi:
            print(j,maxi)
    dupelis = lis
    lis = []
    for k in dupelis:
        if string.count(k)!=maxi:
            lis.append(k)
With the input management, this prints:
a 2
e 2
m 2
n 2
g 1
t 1
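The skipping caused by removing items during iteration can be reproduced in isolation. The snippet below is only an illustrative sketch (the names visited and remaining are mine): it records which letters the for loop actually visits while letters with a count of 2 are removed from the same list.

```python
word = "management"
letters = sorted(set(word))      # ['a', 'e', 'g', 'm', 'n', 't']

visited = []
for letter in letters:
    visited.append(letter)
    if word.count(letter) == 2:
        # Removing shifts every later item one slot to the left,
        # so the loop's next index skips over one element.
        letters.remove(letter)

print(visited)   # ['a', 'g', 'm', 't']  ('e' and 'n' were never visited)
print(letters)   # ['e', 'g', 'n', 't']  ('e' and 'n' were never removed)

# Building a new list instead leaves the list being iterated untouched.
remaining = [c for c in sorted(set(word)) if word.count(c) != 2]
print(remaining)  # ['g', 't']
```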
The problem with your code is the assignment of the variable maxi and the two for loops. "e" won't come second because you are assigning maxi as 2 and string.count(i) will be less than maxi.
for i in lis:
    if string.count(i)>maxi:
        maxi=string.count(i)
for j in lis:
    if string.count(j)==maxi:
        print(j,maxi)
There are several ways of achieving what you are looking for. You can try the solutions as others have explained.
You can use a simple Counter for that:
>>> from collections import Counter
>>> Counter("management")
Counter({'a': 2, 'e': 2, 'm': 2, 'n': 2, 'g': 1, 't': 1})
I'm not really sure what you are trying to achieve by adding a while loop and then two nested for loops inside it. But the same thing can be achieved by a single for loop.
for i in lis:
    print(i, string.count(i))
With this the output will be:
a 2
e 2
g 1
m 2
n 2
t 1
As answered before, you can use a Counter to get the character counts; there is no need to build a set or list.
For sorting, use the built-in sorted function, which accepts a function in its key parameter. Read more about sorting and lambda functions.
>>> from collections import Counter
>>> c = Counter('management')
>>> sorted(c.items())
[('a', 2), ('e', 2), ('g', 1), ('m', 2), ('n', 2), ('t', 1)]
>>> alpha_sorted = sorted(c.items())
>>> sorted(alpha_sorted, key=lambda x: x[1])
[('g', 1), ('t', 1), ('a', 2), ('e', 2), ('m', 2), ('n', 2)]
>>> sorted(alpha_sorted, key=lambda x: x[1], reverse=True) # Reverse ensures you get descending sort
[('a', 2), ('e', 2), ('m', 2), ('n', 2), ('g', 1), ('t', 1)]
The easiest way to count the characters is to use Counter, as suggested by some previous answers. After that, the trick is to come up with a measure that takes both the count and the character into account to achieve the sorting. I have the following:
from collections import Counter
c = Counter('management')
sc = sorted(c.items(),
            key=lambda x: -1000 * x[1] + ord(x[0]))
for char, count in sc:
    print(char, count)
c.items() gives a list of tuples (character, count). We can use sorted() to sort them.
The parameter key is the key. sorted() puts items with lower keys (i.e. keys with smaller values) first, so I have to make a big count have a small value.
I basically give a large negative weight (-1000) to the count (x[1]), then add the ASCII value of the character (ord(x[0])). The result is a sorting order that takes into account the count first, the character second.
An underlying assumption is that ord(x[0]) never exceeds 1000, which should be true of English characters.
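If the ord() < 1000 assumption is a concern, a tuple key gives the same ordering without any weighting, because Python compares tuples element by element. This is a sketch of that alternative rather than a change to the answer above; the helper name by_count_then_char is made up for illustration.

```python
from collections import Counter

def by_count_then_char(word):
    counts = Counter(word)
    # (-count, char): the negated count sorts larger counts first,
    # and the character breaks ties in ascending order.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))

for char, count in by_count_then_char('management'):
    print(char, count)
```

Because no arithmetic mixes the count with the code point, this also works for characters whose ord() value is above 1000.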

How can I make a dictionary / collections.Counter that takes into account the index in Python?

I am aware of dictionaries and collections.Counter in Python.
My question is: how can I make one that takes the position in the string into account?
For example, for this string: aaabaaa
I would like to make tuples that contain each character in order of appearance, keeping a count going left to right and resetting the count whenever a different character is found.
For example, I like to see this output:
[('a', 3), ('b', 1), ('a', 3)]
Any idea how to use the dictionary / Counter/ or is there some other data structure built into Python I can use?
Regards
You could use groupby:
from itertools import groupby
m = [(k, sum(1 for _ in v)) for k, v in groupby('aaabaaa')]
print(m)
Output
[('a', 3), ('b', 1), ('a', 3)]
Explanation
The groupby function makes an iterator that returns consecutive keys and groups from the iterable, in this case 'aaabaaa'. The key k is the value that identifies each group; here the keys are 'a', 'b', 'a' in turn. The expression sum(1 for _ in v) counts the number of elements in the group.
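To make the grouping step more visible, the one-liner can be unrolled into an explicit loop. This is just a sketch of the same technique; the function name run_lengths is my own.

```python
from itertools import groupby

def run_lengths(text):
    runs = []
    for key, group in groupby(text):
        # group is a one-shot iterator over one run of equal characters;
        # list() materialises it so its length can be taken.
        run = list(group)
        runs.append((key, len(run)))
    return runs

print(run_lengths('aaabaaa'))  # [('a', 3), ('b', 1), ('a', 3)]
```

Note that groupby only merges adjacent equal items, which is why 'a' appears twice in the result instead of once with a count of 6.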

What is the inbuilt .count in python?

I've been solving problems in checkio.com and one of the questions was: "Write a function to find the letter which occurs the maximum number of times in a given string"
The top solution was:
import string
def checkio(text):
    """
    We iterate through latin alphabet and count each letter in the text.
    Then 'max' selects the most frequent letter.
    For the case when we have several equal letter,
    'max' selects the first from they.
    """
    text = text.lower()
    return max(string.ascii_lowercase, key=text.count)
I didn't understand what text.count is when it is used as the key in the max function.
Edit: Sorry for not being more specific. I know what the program does as well as the function of str.count(). I want to know what text.count is. If .count is a method then shouldn't it be followed by braces?
The key=text.count argument makes max() call text.count(letter) for each letter of the alphabet; the letter with the highest count wins, which gives you the most frequent letter.
When the following code is run, the result is e, which is, if you count, the most frequent letter.
import string
def checkio(text):
    """
    We iterate through latin alphabet and count each letter in the text.
    Then 'max' selects the most frequent letter.
    For the case when we have several equal letter,
    'max' selects the first from they.
    """
    text = text.lower()
    return max(string.ascii_lowercase, key=text.count)
print(checkio('hello my name is heinst'))
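On the specific question of the missing parentheses: text.count without parentheses does not call anything. It evaluates to a bound method object, which can be stored in a variable or passed to another function and called later. A small sketch (the variable names are only for illustration):

```python
text = "hello"

counter = text.count          # no parentheses: nothing is called yet
print(callable(counter))      # True
print(counter("l"))           # 2, the same as text.count("l")

# max() calls counter(letter) itself, once for every letter it is given.
most_frequent = max("abcdefghijklmnopqrstuvwxyz", key=counter)
print(most_frequent)          # l
```

Writing key=text.count() instead would call the method immediately with no arguments and raise a TypeError.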
A key function in max() is called once for each element, and max() compares the returned values instead of the elements themselves. Here that means one text.count() call per letter of the alphabet, which isn't all that efficient.
Essentially, the line max(string.ascii_lowercase, key=text.count) can be translated to:
max_character, max_count = None, -1
for character in string.ascii_lowercase:
    count = text.count(character)
    if count > max_count:
        max_character, max_count = character, count
return max_character
where str.count() loops through the whole of text counting how often character occurs.
You should really use a multiset / bag here instead; in Python that's provided by the collections.Counter() type:
from collections import Counter
max_character = Counter(text.lower()).most_common(1)[0][0]
The Counter() takes O(N) time to count the characters in a string of length N, then to find the maximum, another O(K) to determine the highest count, where K is the number of unique characters. Asymptotically speaking, that makes the whole process take O(N) time.
The max() approach takes O(MN) time, where M is the length of string.ascii_lowercase.
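One caveat is that a Counter built from the raw text also counts spaces and punctuation, and it does not break ties alphabetically. A hedged sketch of a Counter-based variant that restricts itself to a-z and prefers the alphabetically first letter on ties follows; the function name and the choice to return None for text without letters are my own assumptions, not part of the original checkio task.

```python
from collections import Counter
import string

def most_frequent_letter(text):
    # Count only the characters a-z, ignoring case.
    counts = Counter(ch for ch in text.lower() if ch in string.ascii_lowercase)
    if not counts:
        return None  # assumption: no letters at all means no answer
    # Highest count first; the letter itself breaks ties alphabetically.
    return min(counts, key=lambda ch: (-counts[ch], ch))

print(most_frequent_letter('hello my name is heinst'))  # e
```

This is still a single O(N) pass to count, followed by a scan over at most 26 distinct letters.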
Use the Counter class from the collections module.
>>> import collections
>>> word = "supercalafragalistic"
>>> c = collections.Counter(word)
>>> c.most_common()
[('a', 4), ('c', 2), ('i', 2), ('l', 2), ('s', 2), ('r', 2), ('e', 1), ('g', 1), ('f', 1), ('p', 1), ('u', 1), ('t', 1)]
>>> c.most_common()[0]
('a', 4)

Python 3.3: Sorting a List of Objects/Methods

I apologize, I am not sure of the Python terminology.
I have a class called "Word" that counts and stores all the words in a given text file as tuples, i.e.
self.listName = [('world', 2), ('hello', 3), ('stack', 1), ('overflow', 2)]
where the item at index 0 is the word and the item at index 1 is the number of occurrences. This is stored within the class "Word". Is there any way I can use
listName.sort()
or
sorted(listName, key=lambda Word: Word[0])
to give me the following list:
self.listName = [('hello', 3), ('overflow', 2), ('stack', 1), ('world', 2)]
I want to try to use this rather than attempting to create a new sorting function (which I believe I can do, but I have not been successful)?
I think I should also mention that the List and the Word classes are in different classes (if that makes a difference).
thanks in advance!
You don't need to specify the key at all,
listName = [('world', 2), ('hello', 3), ('stack', 1), ('overflow', 2)]
print(sorted(listName))
Output
[('hello', 3), ('overflow', 2), ('stack', 1), ('world', 2)]
For more information about how this comparison is done, please check this documentation page Comparing Sequences and Other Types
To sort your list,
listName.sort(key=lambda x: x[0])
should sort the list in place alphabetically by word.
Also, you may want to look up
collections.Counter
as it serves a similar purpose to what you are doing.
I actually got it to work. I needed an "overriding method" (?), where I define __lt__ to compare the str element of Word(), because I got this error:
TypeError: unorderable types: str() < Word()
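For readers who hit the same error, here is a rough sketch of what defining __lt__ might look like. The original Word class was never posted, so everything below (the attribute names text and count, the constructor, the use of functools.total_ordering) is a hypothetical reconstruction, not the asker's code.

```python
from functools import total_ordering

@total_ordering
class Word:
    """Hypothetical stand-in for the asker's class."""

    def __init__(self, text, count=0):
        self.text = text
        self.count = count

    def __eq__(self, other):
        if isinstance(other, Word):
            return self.text == other.text
        return NotImplemented

    def __hash__(self):
        return hash(self.text)

    def __lt__(self, other):
        # sorted() and list.sort() only need "<" to order the objects.
        if isinstance(other, Word):
            return self.text < other.text
        return NotImplemented

    def __repr__(self):
        return "Word(%r, %d)" % (self.text, self.count)

words = [Word('world', 2), Word('hello', 3), Word('stack', 1), Word('overflow', 2)]
print(sorted(words))
```

An alternative that avoids touching the class at all is sorted(words, key=lambda w: w.text), under the same assumption about the attribute name.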

Frequently occurring characters in a string in alphabetical order in Python without using control flow

The task is to find the most frequently occurring characters in a string. The characters must be output in descending order of frequency. In case of a tie, i.e. the same number of occurrences, the characters with the same frequency must be output in alphabetical order.
For example:
s="aaccbba"
the output should be
(('a',3),('b',2),('c',2))
and not
(('a',3),('c',2),('b',2))
Note: you shouldn't use control flow statements.
The Python version I am using is 2.7.5.
I even tried using Counter, but it was of no help.
Use collections.Counter:
>>> from collections import Counter
>>> Counter("aaccbba").most_common()
[('a', 3), ('c', 2), ('b', 2)]
The output can be sorted:
>>> sorted(Counter("aaccbba").most_common(), key=lambda v: (-v[1], v[0]))
[('a', 3), ('b', 2), ('c', 2)]
but really, there is no real difference between listing b first or c first; they are otherwise equal.
Sorting the output does double work; most_common() has already sorted the items for you, and the above just sorts the lot again with slightly different criteria.
