Counting number of occurrences in a string - python

I need to return a dictionary that counts the number of times each letter in a predetermined list occurs. The problem is that I need to count both Upper and Lower case letters as the same, so I can't use .lower or .upper.
So, for example, "This is a Python string" should return {'t':3} if "t" is the letter being searched for.
Here is what I have so far...
def countLetters(fullText, letters):
    countDict = {i:0 for i in letters}
    lowerString = fullText.lower()
    for i in lowerString:
        if i in letters:
            countDict[i] += 1
    return countDict
Where 'letters' is the condition and fullText is the string I am searching.
The obvious issue here is that if the test is "T" rather than "t", my code won't count anything. Sorry for any errors in my terminology, I am pretty new to this. Any help would be much appreciated!

To ignore capitalization, convert the input to lower case first:
input = input.lower()
The code below counts every character of the input text using string operations.
It can also be used as a word counter if you split the text on the space character.
input = "Batuq batuq BatuQ"  # sample text
input = input.replace('-', ' ')  # replace separators such as - or + with a space
input = input.replace('.', '')
input = input.replace(',', '')
input = input.replace("`", '')
input = input.replace("'", '')
# input = input.split(' ')  # uncomment to count words instead of characters
dictionary = dict()
for word in input:
    dictionary[word] = input.count(word)
print(dictionary)
# Print the 5 most frequent characters
for k in sorted(dictionary, key=dictionary.get, reverse=True)[:5]:
    print(k, dictionary[k])

Would something like this work that handles both case sensitive letter counts and non case sensitive counts?
from typing import List
def count_letters(
    input_str: str,
    letters: List[str],
    count_case_sensitive: bool=True
):
    """
    count_letters consumes a list of letters and an input string
    and returns a dictionary of counts by letter.
    """
    if count_case_sensitive is False:
        input_str = input_str.lower()
        letters = list(set(map(lambda x: x.lower(), letters)))
    # dict comprehension - build your dict in one line
    # Tutorial on dict comprehensions: https://www.datacamp.com/community/tutorials/python-dictionary-comprehension
    counts = {letter: input_str.count(letter) for letter in letters}
    return counts
# define function inputs
letters = ['t', 'a', 'z', 'T']
string = 'this is an example with sTrings and zebras and Zoos'
# case sensitive
count_letters(
    string,
    letters,
    count_case_sensitive=True
)
# {'t': 2, 'a': 5, 'z': 1, 'T': 1}
# not case sensitive
count_letters(
    string,
    letters,
    count_case_sensitive=False
)
# {'a': 5, 'z': 2, 't': 3} # notice input T is now just t in dictionary of counts

Try it like this:
def count_letters(fullText, letters):
    countDict = {i: 0 for i in letters}
    lowerString = fullText.lower()
    for i in lowerString:
        if i in letters:
            countDict[i] += 1
    return countDict
test = "This is a Python string."
print(count_letters(test, 't'))  # Output: {'t': 3}

You're looping over the wrong string. You need to loop over lowerString, not fullText, so that case is ignored when counting.
It's also more efficient to test if i in countDict than if i in letters, and lowercasing the dictionary keys lets a search letter such as "T" match.
def countLetters(fullText, letters):
    countDict = {i.lower():0 for i in letters}
    lowerString = fullText.lower()
    for i in lowerString:
        if i in countDict:
            countDict[i] += 1
    return countDict

What you can do is simply duplicate the dict with both upper and lowercase like so:
def countLetters(fullText, letters):
    countDict = {}
    for i in letters:
        countDict[i.upper()]=0
        countDict[i.lower()]=0
    lowerString = fullText.lower()
    letters = letters.lower()
    for i in lowerString:
        if i in letters:
            countDict[i] += 1
            if (i!=i.upper()):
                countDict[i.upper()] +=1
    return countDict
print(countLetters("This is a Python string", "TxZY"))
Alternatively, you can loop over the original string and change countDict[i] += 1 to countDict[i.lower()] += 1.

Use the Counter from the collections module
from collections import Counter
input = "Batuq batuq BatuQ"
bow=input.split(' ')
results=Counter(bow)
print(results)
output:
Counter({'Batuq': 1, 'batuq': 1, 'BatuQ': 1})
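Counter can also be applied to characters instead of words, which gets closer to the original question. The following is only a minimal sketch: the function name count_letters_ci is made up for illustration, and it assumes letters is any iterable of single characters.

```python
from collections import Counter

def count_letters_ci(full_text, letters):
    """Count each requested letter, ignoring case (hypothetical helper)."""
    # Count every character of the lowercased text in one pass.
    totals = Counter(full_text.lower())
    # A Counter returns 0 for missing keys, so absent letters show up as 0.
    return {letter.lower(): totals[letter.lower()] for letter in letters}

print(count_letters_ci("This is a Python string", "T"))  # {'t': 3}
```

Because the keys are lowercased, searching for "T" or "t" gives the same result.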

Related

how can I get a dictionary to display 0 when something appears no times?

I want to search a list of strings for the letter a and create a dictionary that shows how many times it appeared. What I have now does this, except that when a doesn't appear it returns an empty dictionary. I want it to return {a: 0}. Here's what I have now, inside a function where it's indented properly. What can I change so that the created dictionary is {a: 0}?
list_strings = (["this","sentence"]
result ={}
for letter in list_strings:
    list_strings = list_strings.lower()
    for letter in letters:
        if letter == "a":
            if letter not in result:
                result[letter] = 0
            result[letter]=result[letter] + 1
return result
You can put {'a': 0} in the result ahead of time:
result = {'a': 0}
and then run the loop as before.
If you want to count every letter in a list of words, defaultdict is useful.
from collections import defaultdict
list_strings = ["this", "sentence", "apple"]
result = defaultdict(int)
result.update({'a': 0})
for word in list_strings:
    for letter in word:
        result[letter] += 1
print(result)
print(result.get('t', 0))
print(result.get('a', 0))
After that, you can read a value with get. The second parameter is optional: if the key is not in the dictionary, get returns that second parameter.
There are some errors in the posted code.
I guess what you need is to count the number of "a"s in all the strings in list_strings and store it in a dictionary. If there are no "a"s, the dictionary value for "a" should be 0.
You can initialize the dictionary value for "a" to zero at the beginning. I have corrected the errors and done this in the code below.
list_strings = (["this","sentence"])
result ={}
result["a"] = 0
for string in list_strings:
    string = string.lower()
    for letter in string:
        if letter == "a":
            result[letter]=result[letter] + 1
print(result)
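If you need to count other characters, you can create the initial dictionary as follows. This needs the string module, so a loop variable named string, as in the code above, would have to be renamed to avoid shadowing it.
import string
d1 = dict.fromkeys(string.ascii_lowercase, 0)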
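Putting the dict.fromkeys idea together with the counting loop might look like the sketch below. The function name count_lowercase_letters is invented here, and the code assumes only the letters a-z should be counted.

```python
import string

def count_lowercase_letters(list_strings):
    """Count a-z across all words; letters that never occur stay at 0."""
    counts = dict.fromkeys(string.ascii_lowercase, 0)
    for word in list_strings:
        for letter in word.lower():
            if letter in counts:  # skip digits, punctuation, etc.
                counts[letter] += 1
    return counts

result = count_lowercase_letters(["this", "sentence"])
print(result["a"])  # 0
print(result["e"])  # 3
```

Every letter is present as a key from the start, so no get or defaultdict is needed when reading the result.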

Can't get my head around the problem, list index out of range (inside 3 loops)

I know list index out of range has been covered a million times before, and I know the issue is probably that I am trying to reference an index position that does not exist, but as there are 3 nested for loops I just can't figure out what is going on.
I am trying to calculate the frequency of each letter of the alphabet in a list of words.
alphabet_string = string.ascii_uppercase
g = list(alphabet_string)
a_count = [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]
y = 0
for word in words:
    for chars in word:
        for letters in chars:
            if letters == g[y]:
                a_count[y] = a_count[y] +1
            y = y + 1
print(a_count[0])
word is in the format of: ['ZYMIC']
chars is in the format of: ZYMIC
letters is in the format of: C
If I substitute the y value for a value between 0 and 25 then it returns as expected.
I have a feeling the issue is as stated above that I am exceeding the index number of 25, so I guess y = y + 1 is in the wrong position. I have however tried it in different positions.
Any help would be appreciated.
Thanks!
Edit: Thanks everyone so much, never had this many responses before, all very helpful!
Storing a_count as a dictionary is the better option for this problem.
a_count = {}
for word in words:
    for chars in word:
        for letters in chars:
            a_count[letters] = a_count.get(letters, 0) + 1
You can also use the Counter() class from the collections library.
from collections import Counter
a_count = Counter()
for word in words:
    for chars in word:
        for letters in chars:
            a_count[letters] += 1
print(a_count.most_common())
Solution via Counter -
from collections import Counter
words = ['TEST','ZYMIC']
print(Counter(''.join(words)))
If you want to stick with your code, change the if condition.
When y = 0, g[y] is 'A', so you are checking whether 'A' == 'Z', where 'Z' is the first letter of the word. What you need instead is to look up the index of the letter in the list g and increase the count at that index by 1. That should make it work, if I understood your problem correctly.
import string
words = ['ZYMIC']
alphabet_string = string.ascii_uppercase
g = list(alphabet_string)
a_count = [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]
for word in words:
    for chars in word:
        for letters in chars:
            if letters in g:
                y = g.index(letters)
                a_count[y] += 1
print(a_count)
You can also drop the if condition and look up the index directly, because the letter will always be in g as long as the words contain only uppercase A-Z. In that case the condition is redundant.
for word in words:
    for chars in word:
        for letters in chars:
            y = g.index(letters)
            a_count[y] += 1
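If the list-of-26-counters layout is kept, the index can also be computed arithmetically instead of searched for with g.index. This is a sketch under assumptions: letter_frequencies is a made-up name, and words is taken to be a flat list of strings rather than the nested structure in the question.

```python
def letter_frequencies(words):
    """Return a 26-slot list of counts for A-Z (hypothetical helper)."""
    a_count = [0] * 26
    for word in words:
        for letter in word.upper():
            # 'A' maps to slot 0, 'B' to slot 1, ..., 'Z' to slot 25.
            offset = ord(letter) - ord('A')
            if 0 <= offset < 26:  # ignore anything outside A-Z
                a_count[offset] += 1
    return a_count

counts = letter_frequencies(['ZYMIC'])
print(counts[25])  # 1, the slot for 'Z'
```

The range check means a stray digit or punctuation mark cannot trigger an IndexError.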
I think the problem comes from the a_count list.
I would suggest another approach here, based on dictionaries:
listeletters = ['foihroh','ZYMIC','ajnaisui', 'fjindsonosn']
alphabeth = {'a' : 0,
             'b' : 0,
             'c': 0}
for string in listeletters:
    for l in string:
        if l in alphabeth.keys():
            alphabeth[l] = alphabeth[l] + 1
print(alphabeth)
I initialize alphabeth first and then get the result I want.

Code not running properly

The code I have below is supposed to count the number of words that start with a certain letter, but when I run it, the counts are all 0 instead of what they should be: {'I': 2, 'b': 2, 't': 3, 'f': 1}. I appreciate any help. Thanks!
def initialLets(keyStr):
    '''Return a dictionary in which each key is the initial letter of a word in t and the value is the number of words that begin with that letter. Upper
    and lower case letters should be considered different letters.'''
    inLets = {}
    strList = keyStr.split()
    firstLets = []
    for words in strList:
        if words[0] not in firstLets:
            firstLets.append(words[0])
    for lets in firstLets:
        inLets[lets] = strList.count(lets)
    return inLets
text = "I'm born to trouble I'm born to fate"
print(initialLets(text))
You can try this:
text = "I'm born to trouble I'm born to fate"
new_text = text.split()
final_counts = {i[0]:sum(b.startswith(i[0]) for b in new_text) for i in new_text}
Output:
{'I': 2, 'b': 2, 't': 3, 'f': 1}
You don't have a counter as you append the letter but not its number of occurrences.
So to simplify:
def initialLets(keyStr):
    '''Return a dictionary in which each key is the initial letter of a word in t and the value is the number of words that begin with that letter. Upper
    and lower case letters should be considered different letters.'''
    strList = keyStr.split()
    # Initialize the dictionary that will hold our results
    result = {}
    for words in strList:
        if words[0] not in result:
            # if the first letter is not in result yet, add it with counter = 1
            result[words[0]] = 1
        else:
            # otherwise increase the number of occurrences
            result[words[0]] += 1
    return result
text = "I'm born to trouble I'm born to fate"
print(initialLets(text))
First, you only append the first letter of a word if it is not already in the list, so the list ends up containing just one of each letter. Second, strList is a list of whole words, so strList.count(lets) looks for words equal to the single letter and finds none. Instead of inLets[lets] = strList.count(lets), it should be inLets[lets] = firstLets.count(lets). While your current code isn't the cleanest way to do it, these two minor modifications make it work.
def initialLets(keyStr):
    '''Return a dictionary in which each key is the initial letter of a word in t and the value is the number of words that begin with that letter. Upper
    and lower case letters should be considered different letters.'''
    inLets = {}
    strList = keyStr.split()
    firstLets = []
    for words in strList:
        firstLets.append(words[0])
    for lets in firstLets:
        inLets[lets] = firstLets.count(lets)
    return inLets
text = "I'm born to trouble I'm born to fate"
print(initialLets(text))
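The same counting can be handed to collections.Counter, which avoids the repeated list.count calls. This is only a sketch of that alternative, and the name initial_letter_counts is invented here.

```python
from collections import Counter

def initial_letter_counts(key_str):
    """Map each initial letter to the number of words starting with it."""
    # split() never yields empty strings, so word[0] is always safe.
    return dict(Counter(word[0] for word in key_str.split()))

text = "I'm born to trouble I'm born to fate"
print(initial_letter_counts(text))  # {'I': 2, 'b': 2, 't': 3, 'f': 1}
```

Upper and lower case letters stay distinct, as the docstring in the question requires.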

Find the last vowel in a string

I can't seem to find the proper way to search a string for the last vowel and store any unique consonants after that last vowel. I have it set up like this so far.
word = input('Input a word: ')
wordlow = word.lower()
VOWELS = 'aeiou'
last_vowel_index = 0
for i, ch in enumerate(wordlow):
    if ch == VOWELS:
        last_vowel_index += i
print(wordlow[last_vowel_index + 1:])
I like COLDSPEED's approach, but for completeness, I will suggest a regex based solution:
import re
s = 'sjdhgdfgukgdk'
re.search(r'([^AEIOUaeiou]*)$', s).group(1)
# 'kgdk'
# '[^AEIOUaeiou]' matches a non-vowel (^ being the negation)
# 'X*' matches 0 or more X
# '$' matches the end of the string
# () marks a group, group(1) returns the first such group
See the docs on python regular expression syntax. Further processing is also needed for the uniqueness part ;)
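For the uniqueness part, one possible follow-up is to de-duplicate the matched tail while keeping its order. This is a sketch, not part of the answer above: unique_trailing_consonants is a made-up name, it relies on dictionaries preserving insertion order (Python 3.7+), and like the regex itself it treats every non-vowel character as a consonant.

```python
import re

def unique_trailing_consonants(s):
    """Unique non-vowel characters after the last vowel, in first-seen order."""
    # The pattern can match the empty string, so search() always finds a match.
    tail = re.search(r'([^AEIOUaeiou]*)$', s).group(1)
    # dict.fromkeys keeps the first occurrence of each character and drops repeats.
    return ''.join(dict.fromkeys(tail))

print(unique_trailing_consonants('sjdhgdfgukgdk'))  # kgd
```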
You can reverse your string, and use itertools.takewhile to take everything until the "last" (now the first after reversal) vowel:
from itertools import takewhile
out = ''.join(takewhile(lambda x: x not in set('aeiou'), string[::-1]))[::-1]
print(out)
'ng'
If there are no vowels, the entire string is returned. Another thing to note is that, you should convert your input string to lower case using a str.lower call, otherwise you risk not counting uppercase vowels.
If you want unique consonants only (without any repetition), a further step is needed:
from collections import OrderedDict
out = ''.join(OrderedDict.fromkeys(out).keys())
Here, the OrderedDict lets us keep order while eliminating duplicates, since, the keys must be unique in any dictionary.
Alternatively, if you want consonants that only appear once, use:
from collections import Counter
c = Counter(out)
out = ''.join(x for x in out if c[x] == 1)
You can simply write a function for that:
def func(astr):
    vowels = set('aeiouAEIOU')
    # Container for all unique not-vowels after the last vowel
    unique_notvowels = set()
    # iterate over reversed string that way you don't need to reset the index
    # every time a vowel is encountered.
    for idx, item in enumerate(astr[::-1], 1):
        if item in vowels:
            # return the vowel, the index of the vowel and the container
            return astr[-idx], len(astr)-idx, unique_notvowels
        unique_notvowels.add(item)
    # In case no vowel is found this will raise an Exception. You might want/need
    # a different behavior...
    raise ValueError('no vowels found')
For example:
>>> func('asjhdskfdsbfkdes')
('e', 14, {'s'})
>>> func('asjhdskfdsbfkds')
('a', 0, {'b', 'd', 'f', 'h', 'j', 'k', 's'})
It returns the last vowel, the index of the vowel and all unique not-vowels after the last vowel.
In case the non-vowels should be kept in order, you need to use an ordered container instead of the set, for example a list (could be much slower) or collections.OrderedDict (more memory expensive but faster than the list).
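You can just reverse your string and loop over each letter until you encounter the first vowel:
for i, letter in enumerate(reversed(word)):
    if letter in VOWELS:
        break
else:
    i = len(word)  # no vowel found: keep the whole word
# len(word) - i avoids word[-0:], which would return the whole word
print(word[len(word) - i:])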
last_vowel will return the last vowel in the word
last_index will give you the last index of this vowel in the input
Python 2.7
input = raw_input('Input a word: ').lower()
last_vowel = [a for a in input if a in "aeiou"][-1]
last_index = input.rfind(last_vowel)
print(last_vowel)
print(last_index)
Python 3.x
input = input('Input a word: ').lower()
last_vowel = [a for a in input if a in "aeiou"][-1]
last_index = input.rfind(last_vowel)
print(last_vowel)
print(last_index)

Function for counting words of i or more vowels in Python?

In the code below Question 13a asks me to have the function count how many vowels are in a string. (I don't have to call that function in my homework.) But I called it to test it out and that part is completely correct and it works. The string can be both uppercase and lowercase with NO punctuation.
Question 13b asks to create a dictionary. The key is the word in a string (the string has multiple words). The value is how many vowels in that individual word. The question is asking this: If the word has AT LEAST i amount of vowels, then append it to the dictionary (The word with the amount vowels) This function has two parameters. The first one is a string with NO punctuation. The second parameter represents the number of how many vowels the word MUST have to be appended to the dictionary. The professor wants me to call Function 13a this function as part of the algorithm. That being said, the output of Question 13a is the value of the key (the individual word) in this problem. I am having trouble with this question, because I just can't get Python to append the output of 13a (the number of vowels for a word) to the dictionary key.
And also in the code below, I did not work on the part yet where I was supposed use the variable i.
Here is my code:
print("Question 13a")
def vowelCount(s):
    vowels = 'aeiou'
    countVowels = 0
    for letter in s.lower():
        if letter in vowels:
            countVowels += 1
    print(countVowels)
print("Question 13b")
def manyVowels(t, i):
    my_string = t.split()
    my_dict = {}
    for word in my_string:
        number = vowelCount(word)
        my_dict[word].append(number)
    print(my_dict)
print(manyVowels('they are endowed by their creator with certain unalienable rights', 2))
If you cannot understand the question then here is the professor's directions:
Question 13a (10 points)
The letters a, e, i, o and u are vowels. No other letter is a vowel.
Write a function named vowelCount() that takes a string, s, as a parameter and returns the
number of vowels that s contains. The string s may contain both upper and lower case characters.
For example, the function call vowelCount('Amendment') should return the integer 3 because
there are 3 occurrences of the letters 'A' and 'e'.
Question 13b (10 points)
Write a function named manyVowels() that takes a body of text, t, and an integer, i, as
parameters. The text t contains only lower case letters and white space.
manyVowels() should return a dictionary in which the keys are all words in t that contain at least i
vowels. The value corresponding to each key is the number of vowels in it. For full credit,
manyVowels() must call the helper function vowelCount() from Question 11a to determine the
number of vowels in each word. For example, if the input text contains the word "hello", then
"hello" should be a key in the dictionary and its value should be 2 because there are 2 vowels in
"hello".
Input:
1. t, a text consisting of lower case letters and white space
2. i, a threshold number of vowels
Return: a dictionary of key-value pairs in which the keys are the words in t containing at least i
vowels and the value of each key is the number of vowels it contains.
For example, the following would be correct output.
text = 'they are endowed by their creator with certain unalienable rights'
print(manyVowels(text, 3))
{'certain': 3, 'unalienable': 6, 'creator': 3, 'endowed': 3}
Add a condition so that only words with enough vowels are added:
def vowelCount(s):
    vowels = 'aeiou'
    countVowels = 0
    for letter in s.lower():
        if letter in vowels:
            countVowels += 1
    return countVowels
def manyVowels(t, i):
    my_string = t.split()
    my_dict = {}
    for word in my_string:
        number = vowelCount(word)
        if number >= i:
            my_dict[word] = number
    return my_dict
The line my_dict[word] = number adds the result of vowelCount(word) to your dictionary, but only if the number of vowels is at least i.
def vowelCount(s):
    num_vowels=0
    for char in s:
        if char in "aeiouAEIOU":
            num_vowels = num_vowels+1
    return num_vowels
def manyVowels(text, i):
    words_with_many_vowels = dict()
    text_array = text.split()
    for word in text_array:
        if vowelCount(word) >= i:
            words_with_many_vowels[word] = vowelCount(word)
    return words_with_many_vowels
print(vowelCount('Amendment'))
text = 'they are endowed by their creator with certain unalienable rights'
print(manyVowels(text, 3))
Output:
3
{'creator': 3, 'certain': 3, 'endowed': 3, 'unalienable': 6}
Your code needs some adjustments.
The first function should return a value, not print it:
return countVowels
The second function is not adding the key with its accompanying value to the dictionary correctly. You should use:
my_dict[word] = number
and, after the loop, keep only the words with at least i vowels:
return {k:v for k, v in my_dict.items() if v >= i}
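Assembled into a complete program, those adjustments might look like the sketch below. It keeps the function names from the assignment but is only one possible arrangement, not the professor's reference solution. Each word's vowel count is computed once and then filtered.

```python
def vowelCount(s):
    """Return the number of vowels (a, e, i, o, u) in s, ignoring case."""
    return sum(1 for letter in s.lower() if letter in 'aeiou')

def manyVowels(t, i):
    """Map each word of t with at least i vowels to its vowel count."""
    # Call the helper once per word, then keep only the words that qualify.
    counts = {word: vowelCount(word) for word in t.split()}
    return {word: n for word, n in counts.items() if n >= i}

text = 'they are endowed by their creator with certain unalienable rights'
print(vowelCount('Amendment'))  # 3
print(manyVowels(text, 3))
# {'endowed': 3, 'creator': 3, 'certain': 3, 'unalienable': 6}
```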
