I'm fairly new to Python and I have this program that I was tinkering with. It's supposed to get a string from input and display which character is the most frequent.
stringToData = raw_input("Please enter your string: ")
# imports the collections module
import collections
# gets the data needed from the collection
letter, count = collections.Counter(stringToData).most_common(1)[0]
# prints the results
print "The most frequent character is %s, which occurred %d times." % (
    letter, count)
However, if the string has one of each character, it only displays one letter and says it's the most frequent character. I thought about changing the number in the parentheses in most_common(number), but I don't want it to display how many times the other letters occurred every time.
Thank you to all that help!
As I explained in the comment:
You can leave off the parameter to most_common to get a list of all characters, ordered from most common to least common. Then just loop through that result and collect the characters as long as the counter value is still the same. That way you get all characters that are most common.
Counter.most_common(n) returns the n most common elements from the counter. Or in case where n is not specified, it will return all elements from the counter, ordered by the count.
>>> collections.Counter('abcdab').most_common()
[('a', 2), ('b', 2), ('c', 1), ('d', 1)]
You can use this behavior to simply loop through all elements, ordered by their count. As long as the count is the same as that of the first element in the output, you know that the element occurred just as often in the string.
>>> c = collections.Counter('abcdefgabc')
>>> maxCount = c.most_common(1)[0][1]
>>> elements = []
>>> for element, count in c.most_common():
...     if count != maxCount:
...         break
...     elements.append(element)
...
>>> elements
['a', 'c', 'b']
>>> [e for e, n in c.most_common() if n == maxCount]
['a', 'c', 'b']
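The same idea can be wrapped in a small helper. This is only a sketch, written for Python 3; the name most_frequent_chars is made up for illustration, and it uses max() over the counts instead of looping over most_common().

```python
from collections import Counter

def most_frequent_chars(text):
    """Return (characters, count) for every character tied for the highest count."""
    counts = Counter(text)
    if not counts:
        # Empty input: nothing to report
        return [], 0
    max_count = max(counts.values())
    return [ch for ch, n in counts.items() if n == max_count], max_count

chars, count = most_frequent_chars("abcdab")
print("Most frequent: %s (%d times each)" % (", ".join(chars), count))
```

With a string that has one of each character, every character comes back with a count of 1, so the caller can decide how to report the tie.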
I need to know how to identify prefixes in strings in a list. For example,
list = ['nomad', 'normal', 'nonstop', 'noob']
Its answer should be 'no' since every string in the list starts with 'no'
I was wondering if there is a method that iterates over the letters of every string in the list at the same time and checks whether they are all the same.
Use os.path.commonprefix; it will do exactly what you want.
In [1]: list = ['nomad', 'normal', 'nonstop', 'noob']
In [2]: import os.path as p
In [3]: p.commonprefix(list)
Out[3]: 'no'
As an aside, naming a list "list" shadows the built-in list type, so list(...) can no longer be called in that scope. I would recommend using a different variable name.
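A few more calls show how os.path.commonprefix behaves at the edges. This is a short sketch for Python 3 with made-up inputs; note that despite living in os.path, the comparison is character by character, not per path component.

```python
import os.path

words = ['nomad', 'normal', 'nonstop', 'noob']
prefix = os.path.commonprefix(words)                  # 'no'

# No shared prefix, or an empty list, gives the empty string
no_overlap = os.path.commonprefix(['abc', 'xyz'])     # ''
empty = os.path.commonprefix([])                      # ''

# Character-wise comparison: the result need not be a valid path
partial_path = os.path.commonprefix(['/usr/lib', '/usr/local'])  # '/usr/l'

print(prefix, repr(no_overlap), repr(empty), partial_path)
```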
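Here is code that uses no libraries:
l = ['nomad', 'normal', 'nonstop', 'noob']
for i in range(len(l[0]) + 1):
    if False in [l[0][:i] == j[:i] for j in l]:
        print(l[0][:i-1])
        break
else:
    # no mismatch found: the whole first string is the common prefix
    print(l[0])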
gives output:
no
There is no built-in function to do this. If you are looking for short python code that can do this for you, here's my attempt:
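def longest_common_prefix(words):
    i = 0
    # the bound on i keeps the loop from running forever when all words are identical
    while i <= len(words[0]) and len(set([word[:i] for word in words])) <= 1:
        i += 1
    return words[0][:i-1]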
Explanation: words is an iterable of strings. The list comprehension
[word[:i] for word in words]
uses string slices to take the first i letters of each string. At the beginning, these would all be empty strings. Then, it would consist of the first letter of each word. Then the first two letters, and so on.
Casting to a set removes duplicates. For example, set([1, 2, 2, 3]) = {1, 2, 3}. By casting our list of prefixes to a set, we remove duplicates. If the length of the set is less than or equal to one, then they are all identical.
The counter i just keeps track of how many letters are identical so far.
We return words[0][:i-1]. We arbitrarily choose the first word and take the first i-1 letters (which would be the same for any word in the list). The reason that it's i-1 and not i is that i gets incremented before we check if all of the words still share the same prefix.
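To make the explanation concrete, here is a small trace of the set of prefixes for the first few values of i. It is a sketch for Python 3; the variable name steps is only for illustration.

```python
words = ['nomad', 'normal', 'nonstop', 'noob']

# For each i, collect the distinct prefixes of length i
steps = [sorted(set(word[:i] for word in words)) for i in range(4)]

for i, prefixes in enumerate(steps):
    # One distinct prefix means all words still agree on their first i letters
    print(i, prefixes)
```

The set has a single element for i = 0, 1 and 2, and four elements for i = 3, which is where the loop stops.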
Here's a fun one:
l = ['nomad', 'normal', 'nonstop', 'noob']

def common_prefix(lst):
    for s in zip(*lst):
        if len(set(s)) == 1:
            yield s[0]
        else:
            return

result = ''.join(common_prefix(l))
Result:
'no'
To answer the spirit of your question - zip(*lst) is what allows you to "iterate letters in every string in the list at the same time". For example, list(zip(*lst)) would look like this:
[('n', 'n', 'n', 'n'), ('o', 'o', 'o', 'o'), ('m', 'r', 'n', 'o'), ('a', 'm', 's', 'b')]
Now all you need to do is find out the common elements, i.e. the len of set for each group, and if they're common (len(set(s)) == 1) then join it back.
As an aside, you probably don't want to call your list by the name list. Any time you call list() afterwards is gonna be a headache. It's bad practice to shadow built-in names.
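The same column-wise idea can also be written with itertools.takewhile, which stops at the first column whose letters differ. This is just a sketch for Python 3, and the function name common_prefix_takewhile is made up for illustration.

```python
from itertools import takewhile

def common_prefix_takewhile(words):
    """Join the leading columns of zip(*words) whose characters all agree."""
    columns = zip(*words)  # zip stops at the shortest string
    same = takewhile(lambda column: len(set(column)) == 1, columns)
    return ''.join(column[0] for column in same)

words = ['nomad', 'normal', 'nonstop', 'noob']
print(common_prefix_takewhile(words))  # no
```

An empty list yields an empty string here, because zip() with no arguments produces no columns.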
I have a list containing a string in each element. I want to compare the characters of each string starting from the first character to the end. The loop loops over the length of the shortest string in the list.
For example:
strs = ["flower", "flow", "flight"]
The comparison would look something like this:
for sub_i in range(len(min(strs, key=len))):
    if strs[0][sub_i] == strs[1][sub_i] == strs[2][sub_i]:
        pass  # do something
How would I expand this so that I can have an arbitrary number of elements in strs? (Instead of just 3 in my example)
For some prefix length k, check whether all strings share their first k characters:
len(set([s[:k] for s in strs])) == 1
Example:
strs = ["flower", "flow", "flight"]
k = 2
if len(set([s[:k] for s in strs])) == 1:
    # do something
    print ("same")
Output:
same
For arbitrary lengths, you can zip() the strings. This automatically iterates only up to the length of the shortest string. Then determine if all the letters are the same. The code below converts each group to a set() and checks the length (which will be 1 if all elements are equal), but of course, there are other ways:
strs = ["flower", "flow", "flight"]
for letters in zip(*strs):
    if len(set(letters)) == 1:
        # do something
        print(letters)
Prints:
('f', 'f', 'f')
('l', 'l', 'l')
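If the positions are needed rather than the letters, enumerate() can be combined with zip(). The sketch below (Python 3) also uses all() as one of the "other ways" to test equality; the names matching and all_equal are made up for illustration.

```python
strs = ["flower", "flow", "flight"]

def all_equal(letters):
    """True if every item in the tuple equals the first one."""
    return all(ch == letters[0] for ch in letters)

# Indices, up to the length of the shortest string, where every string agrees
matching = [i for i, letters in enumerate(zip(*strs)) if all_equal(letters)]
print(matching)  # [0, 1]
```

This works for any number of strings in strs, since zip(*strs) builds one tuple per position regardless of how many strings are passed.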
I need the number of elements that occur only once in a list (Python).
For example the result for
mylist = ['a', 'a', 'a', 'a', 'b', 'c']
would be
2
You can use collections.Counter to count the number of occurrences of each distinct item, and retain only those with a count of 1 with a generator expression:
from collections import Counter
sum(1 for c in Counter(mylist).values() if c == 1)
This returns: 2
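If you also want to know which items occur once, the same Counter can be filtered into a list. A minimal sketch for Python 3; the name singles is made up for illustration.

```python
from collections import Counter

mylist = ['a', 'a', 'a', 'a', 'b', 'c']
counts = Counter(mylist)

# Keep the items whose count is exactly 1
singles = [item for item, n in counts.items() if n == 1]
print(len(singles), singles)
```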
This looks like a job for a set.
If I were you, I would convert the list to a set and check its size.
You can check examples of how to do it here
You basically want to iterate through the list and check to see how many times each element occurs in the list. If it occurs more than once, you don't want it but if it occurs only once, you increase your counter by 1.
count = 0
for letter in mylist:
    if mylist.count(letter) == 1:
        count += 1
print (count)
This gives the number of distinct values (3 for the example list), not the number of values that occur exactly once:
len(set(mylist))
It does require your values to be hashable.
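For the example list, the size of the set and the number of items that occur once are different quantities. A quick check, as a sketch for Python 3 (the names distinct and once are made up for illustration):

```python
from collections import Counter

mylist = ['a', 'a', 'a', 'a', 'b', 'c']

distinct = len(set(mylist))                                 # 'a', 'b', 'c'
once = sum(1 for n in Counter(mylist).values() if n == 1)   # 'b', 'c'

print(distinct, once)  # 3 2
```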
I've been solving problems in checkio.com and one of the questions was: "Write a function to find the letter which occurs the maximum number of times in a given string"
The top solution was:
import string

def checkio(text):
    """
    We iterate through latin alphabet and count each letter in the text.
    Then 'max' selects the most frequent letter.
    For the case when we have several equal letter,
    'max' selects the first from they.
    """
    text = text.lower()
    return max(string.ascii_lowercase, key=text.count)
I didn't understand what text.count is when it is used as the key in the max function.
Edit: Sorry for not being more specific. I know what the program does as well as the function of str.count(). I want to know what text.count is. If .count is a method, then shouldn't it be followed by parentheses?
key=text.count passes the method text.count itself to max() without calling it, which is why there are no parentheses. max() then calls it once for every letter of the alphabet to count how often that letter appears in the string, and returns the letter with the highest count.
When the following code is run, the result is e, which is, if you count, the most frequent letter.
import string

def checkio(text):
    """
    We iterate through latin alphabet and count each letter in the text.
    Then 'max' selects the most frequent letter.
    For the case when we have several equal letter,
    'max' selects the first from they.
    """
    text = text.lower()
    return max(string.ascii_lowercase, key=text.count)

print checkio('hello my name is heinst')
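On why text.count has no parentheses: without them, the expression refers to the method object itself, bound to text, and it can be stored and called later. A small sketch for Python 3; the variable names counter and best are made up for illustration.

```python
text = "hello my name is heinst"

counter = text.count       # no parentheses: a bound method object, not a call
print(callable(counter))   # True
print(counter("e"))        # same as text.count("e"), which is 3

# max() calls key(item) for every item and returns the item with the largest result
best = max("abcdefgh", key=counter)
print(best)                # e
```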
max() calls the key function once for each element and compares the returned values instead of the elements themselves. In this case that isn't all that efficient.
Essentially, the line max(string.ascii_lowercase, key=text.count) can be translated to:
max_character, max_count = None, -1
for character in string.ascii_lowercase:
    count = text.count(character)
    if count > max_count:
        # remember both the letter and its count
        max_character, max_count = character, count
return max_character
where str.count() loops through the whole of text counting how often character occurs.
You should really use a multiset / bag here instead; in Python that's provided by the collections.Counter() type:
from collections import Counter
max_character = Counter(text.lower()).most_common(1)[0][0]
Counter() takes O(N) time to count the characters in a string of length N, and finding the highest count takes another O(K), where K is the number of unique characters. Since K is at most N, the whole process takes O(N) time asymptotically.
The max() approach takes O(MN) time, where M is the length of string.ascii_lowercase.
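One difference to keep in mind: a plain Counter over the text also counts spaces and punctuation, while the max() version only looks at letters. The sketch below (Python 3) filters to ASCII letters first and keeps the original tie-breaking rule; the function name most_common_letter is made up for illustration.

```python
import string
from collections import Counter

def most_common_letter(text):
    """Most frequent ASCII letter; ties go to the letter earliest in the alphabet."""
    counts = Counter(ch for ch in text.lower() if ch in string.ascii_lowercase)
    # Walking the alphabet in order makes max() keep the first of any tied letters;
    # a Counter returns 0 for letters that never appeared
    return max(string.ascii_lowercase, key=lambda ch: counts[ch])

sentence = 'hello my name is heinst'
print(most_common_letter(sentence))                    # e
print(repr(Counter(sentence).most_common(1)[0][0]))    # ' ' (the space wins unfiltered)
```

This still makes a single pass over the text to count, followed by one pass over the 26 letters.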
Use the Counter class from the collections module.
>>> import collections
>>> word = "supercalafragalistic"
>>> c = collections.Counter(word)
>>> c.most_common()
[('a', 4), ('c', 2), ('i', 2), ('l', 2), ('s', 2), ('r', 2), ('e', 1), ('g', 1), ('f', 1), ('p', 1), ('u', 1), ('t', 1)]
>>> c.most_common()[0]
('a', 4)
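The order of letters with equal counts in most_common() can differ between Python versions, so if a predictable order matters it may help to sort explicitly. A sketch for Python 3; the name ranked is made up for illustration.

```python
from collections import Counter

word = "supercalafragalistic"
c = Counter(word)

top_letter, top_count = c.most_common(1)[0]

# Highest count first; ties broken alphabetically
ranked = sorted(c.items(), key=lambda pair: (-pair[1], pair[0]))
print(top_letter, top_count)
print(ranked[:3])
```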
def display_hand(hand):
    for letter in hand.keys():
        for j in range(hand[letter]):
            print letter,
This prints something like: b e h q u w x. This is the desired output.
How can I modify this code to get the output only when the function has finished its loops?
Something like below code causes me problems as I can't get rid of dictionary elements like commas and single quotes when printing the output:
def display_hand(hand):
    dispHand = []
    for letter in hand.keys():
        for j in range(hand[letter]):
            ##code##
    print dispHand
UPDATE
I find John's answer very elegant. Allow me, however, to expand on Kugel's response:
Kugel's approach answered my question. However, I kept running into an additional issue: the function would always return None in addition to printing the output. Reason: whenever you don't explicitly return a value from a function in Python, None is implicitly returned. I couldn't find a way to explicitly return the hand. With Kugel's approach I got closer, but the hand is still buried in a for loop.
You can do this in one line with a generator expression that has two for clauses:
print ' '.join(letter for letter, count in hand.iteritems() for i in range(count))
Let's break that down piece by piece. I'll use a sample dictionary that has a couple of counts greater than 1, to show the repetition part working.
>>> hand
{'h': 3, 'b': 1, 'e': 2}
Get the letters and counts in a form that we can iterate over.
>>> list(hand.iteritems())
[('h', 3), ('b', 1), ('e', 2)]
Now just the letters.
>>> [letter for letter, count in hand.iteritems()]
['h', 'b', 'e']
Repeat each letter count times.
>>> [letter for letter, count in hand.iteritems() for i in range(count)]
['h', 'h', 'h', 'b', 'e', 'e']
Use str.join to join them into one string.
>>> ' '.join(letter for letter, count in hand.iteritems() for i in range(count))
'h h h b e e'
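If the function should hand the string back instead of printing it, the join can simply be returned. This is a sketch for Python 3, where iteritems() is replaced by items(); the function name hand_to_string is made up for illustration, and the output order assumes a Python version where dicts keep insertion order (3.7 or later).

```python
from collections import Counter

def hand_to_string(hand):
    """Return the letters of hand, each repeated by its count, separated by spaces."""
    return ' '.join(letter for letter, count in hand.items() for _ in range(count))

hand = {'h': 3, 'b': 1, 'e': 2}
print(hand_to_string(hand))                  # h h h b e e

# Counter.elements() repeats each key by its count, so it gives the same letters
print(' '.join(Counter(hand).elements()))
```

Because the function returns a value, print(hand_to_string(hand)) no longer shows a stray None.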
Your ##code perhaps?
dispHand.append(letter)
Update:
To print your list then:
for item in dispHand:
    print item,
Another option, without a nested loop:
"".join((x+' ') * y for x, y in hand.iteritems()).strip()
Use
" ".join(sequence)
to print a sequence without commas and the enclosing brackets.
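A short illustration, as a sketch for Python 3 with a made-up list of letters. It also shows that join() accepts only strings, so anything else raises a TypeError.

```python
letters = ['b', 'e', 'h']
joined = " ".join(letters)
print(joined)  # b e h

try:
    " ".join(['a', 1])
except TypeError as error:
    # join() refuses non-string items
    join_error = error
    print("join failed:", error)
```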
If you have integers or other stuff in the sequence
" ".join(str(x) for x in sequence)