Count the number of occurrences of a character in a string - python
How do I count the number of occurrences of a character in a string?
e.g. 'a' appears in 'Mary had a little lamb' 4 times.
str.count(sub[, start[, end]])
Return the number of non-overlapping occurrences of substring sub in the range [start, end]. Optional arguments start and end are interpreted as in slice notation.
>>> sentence = 'Mary had a little lamb'
>>> sentence.count('a')
4
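The optional start and end arguments quoted above follow slice semantics, so end is exclusive. A minimal sketch of how they narrow the search (the variable names are only illustrative):

```python
sentence = 'Mary had a little lamb'

# Count over the whole string.
total = sentence.count('a')

# Count only within sentence[5:], i.e. 'had a little lamb'.
from_index_5 = sentence.count('a', 5)

# Count only within sentence[0:8], i.e. 'Mary had'.
first_eight = sentence.count('a', 0, 8)

# Matches do not overlap: 'aa' fits twice into 'aaaa', not three times.
non_overlapping = 'aaaa'.count('aa')

print(total, from_index_5, first_eight, non_overlapping)
```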
You can use .count() :
>>> 'Mary had a little lamb'.count('a')
4
To get the counts of all letters, use collections.Counter:
>>> from collections import Counter
>>> counter = Counter("Mary had a little lamb")
>>> counter['a']
4
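Two properties of Counter are handy here: a missing key yields 0 rather than a KeyError, and most_common() lists the highest counts. A small sketch, assuming the same sentence as above:

```python
from collections import Counter

counter = Counter("Mary had a little lamb")

# Looking up a character that never occurs returns 0 and does not
# add the key to the counter.
missing = counter['x']

# most_common(n) returns the n highest counts as (char, count) pairs.
top_two = counter.most_common(2)

print(missing, top_two)
```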
Regular expressions maybe?
import re
my_string = "Mary had a little lamb"
len(re.findall("a", my_string))
Python-3.x:
"aabc".count("a")
str.count(sub[, start[, end]])
Return the number of non-overlapping occurrences of substring sub in the range [start, end]. Optional arguments start and end are interpreted as in slice notation.
my_string.count('a')
str.count(a) is the best solution to count a single character in a string. But if you need to count more characters you would have to read the whole string as many times as characters you want to count.
A better approach for this job would be:
from collections import defaultdict
text = 'Mary had a little lamb'
chars = defaultdict(int)
for char in text:
    chars[char] += 1
So you'll have a dict that returns the number of occurrences of every letter in the string and 0 if it isn't present.
>>>chars['a']
4
>>>chars['x']
0
For a case-insensitive counter you can subclass defaultdict and override its accessor and mutator methods (the methods of the built-in class itself cannot be reassigned):
class CICounter(defaultdict):
    def __getitem__(self, k):
        return super().__getitem__(k.lower())
    def __setitem__(self, k, v):
        super().__setitem__(k.lower(), v)
chars = CICounter(int)
for char in text:
    chars[char] += 1
>>>chars['a']
4
>>>chars['M']
2
>>>chars['x']
0
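If subclassing feels heavy, a simpler route to the same case-insensitive counts is to normalise the text once before counting. This is a sketch of that alternative, not the approach of the answer above; `folded` and `ci_count` are hypothetical names:

```python
from collections import Counter

text = 'Mary had a little lamb'

# Fold the case once, then count every character in a single pass.
folded = Counter(text.casefold())

def ci_count(char):
    """Return the case-insensitive count of a single character."""
    return folded[char.casefold()]

print(ci_count('a'), ci_count('M'), ci_count('x'))
```

The trade-off is that the original casing is discarded, which is fine when only the counts matter.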
This easy and straightforward function might help:
def check_freq(x):
    freq = {}
    for c in set(x):
        freq[c] = x.count(c)
    return freq
check_freq("abbabcbdbabdbdbabababcbcbab")
{'a': 7, 'b': 14, 'c': 3, 'd': 3}
If a comprehension is desired:
def check_freq(x):
    return {c: x.count(c) for c in set(x)}
Regular expressions are very useful if you want case-insensitivity (and of course all the power of regex).
my_string = "Mary had a little lamb"
# simplest solution, using count, is case-sensitive
my_string.count("m") # yields 1
import re
# case-sensitive with regex
len(re.findall("m", my_string))
# three ways to get case insensitivity - all yield 2
len(re.findall("(?i)m", my_string))
len(re.findall("m|M", my_string))
len(re.findall(re.compile("m",re.IGNORECASE), my_string))
Be aware that the regex version takes on the order of ten times as long to run, which will likely be an issue only if my_string is tremendously long, or the code is inside a deep loop.
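The speed difference is easy to measure yourself. The sketch below uses timeit to compare a lower-cased str.count with a precompiled case-insensitive pattern; the repeat factor of 1000 and the 200 runs are arbitrary assumptions, and the ratio will vary by machine and Python version:

```python
import re
import timeit

# A longer input so that the timings are measurable.
my_string = "Mary had a little lamb" * 1000
pattern = re.compile("m", re.IGNORECASE)

def with_count():
    # Case-insensitive count via lower-casing, then str.count.
    return my_string.lower().count("m")

def with_regex():
    # Case-insensitive count via a compiled regular expression.
    return len(pattern.findall(my_string))

# Both approaches must agree before the timings mean anything.
assert with_count() == with_regex()

t_count = timeit.timeit(with_count, number=200)
t_regex = timeit.timeit(with_regex, number=200)
print(f"count: {t_count:.4f}s  regex: {t_regex:.4f}s")
```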
I don't know about 'simplest' but simple comprehension could do:
>>> my_string = "Mary had a little lamb"
>>> sum(char == 'a' for char in my_string)
4
This takes advantage of the built-in sum, a generator expression, and the fact that bool is a subclass of int: it adds up how many times a character is equal to 'a'.
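To see why summing comparisons works, it may help to look at the intermediate values. A short illustration (the name `flags` is just for this example):

```python
my_string = "Mary had a little lamb"

# Each comparison yields a bool; here are the first four of them.
flags = [char == 'a' for char in my_string[:4]]

# bool is a subclass of int, so True behaves as 1 and False as 0.
is_int_subclass = issubclass(bool, int)
two = True + True

# Summing the booleans therefore counts the matches.
total = sum(char == 'a' for char in my_string)

print(flags, is_int_subclass, two, total)
```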
a = 'have a nice day'
symbol = 'abcdefghijklmnopqrstuvwxyz'
for key in symbol:
    print(key, a.count(key))
An alternative way to get all the character counts without using Counter(), count() or regex:
sentence = 'Mary had a little lamb'
counts_dict = {}
for c in sentence:
    if c not in counts_dict:
        counts_dict[c] = 0
    counts_dict[c] += 1
for key, value in counts_dict.items():
    print(key, value)
I am a fan of the pandas library, in particular the value_counts() method. You could use it to count the occurrence of each character in your string:
>>> import pandas as pd
>>> phrase = "I love the pandas library and its `value_counts()` method"
>>> pd.Series(list(phrase)).value_counts()
8
a 5
e 4
t 4
o 3
n 3
s 3
d 3
l 3
u 2
i 2
r 2
v 2
` 2
h 2
p 1
b 1
I 1
m 1
( 1
y 1
_ 1
) 1
c 1
dtype: int64
count is definitely the most concise and efficient way of counting the occurrence of a character in a string but I tried to come up with a solution using lambda, something like this :
sentence = 'Mary had a little lamb'
sum(map(lambda x : 1 if 'a' in x else 0, sentence))
This will result in :
4
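This has one more property: if sentence is a list of substrings, as below, it still gives the right number here, because in tests whether each substring contains 'a'. Note that it counts the substrings that contain 'a', not the occurrences themselves, so the result is only correct when no substring contains 'a' more than once. Have a look: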
sentence = ['M', 'ar', 'y', 'had', 'a', 'little', 'l', 'am', 'b']
sum(map(lambda x : 1 if 'a' in x else 0, sentence))
This also results in :
4
But of course this only works when checking for the occurrence of a single character, such as 'a' in this particular case.
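One caveat with the membership test is worth spelling out: it counts the items that contain 'a', not the occurrences of 'a'. The sketch below contrasts the two on a made-up list (`count_items_containing` and `chunks` are hypothetical names):

```python
def count_items_containing(parts, char='a'):
    # Same logic as the lambda above: one point per item containing char.
    return sum(map(lambda x: 1 if char in x else 0, parts))

chunks = ['banana', 'lamb']

# Two items contain 'a' ...
membership = count_items_containing(chunks)

# ... but 'a' actually occurs four times across them.
occurrences = sum(part.count('a') for part in chunks)

print(membership, occurrences)
```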
a = "I walked today,"
c = ['d', 'e', 'f']
count = 0
for i in a:
    if str(i) in c:
        count += 1
print(count)
I know the question asks for the count of one particular letter. Here is generic, case-insensitive code that does not use any counting method.
sentence1 = " Mary had a little lamb"
count = {}
for i in sentence1:
    key = i.lower()  # normalise first so that 'M' and 'm' share one entry
    if key in count:
        count[key] = count[key] + 1
    else:
        count[key] = 1
print(count)
output
{' ': 5, 'm': 2, 'a': 4, 'r': 1, 'y': 1, 'h': 1, 'd': 1, 'l': 3, 'i': 1, 't': 2, 'e': 1, 'b': 1}
Now if you want any particular letter frequency, you can print like below.
print(count['m'])
2
The easiest way is a one-liner:
'Mary had a little lamb'.count("a")
But if you want, you can use this too:
sentence = 'Mary had a little lamb'
count = 0
for letter in sentence:
    if letter == "a":
        count += 1
print(count)
To find the occurrences of every character in a sentence you can use the code below.
First I extract the unique characters from the sentence, then I count the occurrences of each one in the sentence; this includes the blank space too.
test_str = "Mary had a little lamb"
ab = set(test_str)
for i in ab:
    counter = test_str.count(i)
    if i == ' ':
        i = 'Space'
    print(counter, ':', i, ',')
Output of the above code is below.
1 : r ,
1 : h ,
1 : e ,
1 : M ,
4 : a ,
1 : b ,
1 : d ,
2 : t ,
3 : l ,
1 : i ,
4 : Space ,
1 : y ,
1 : m ,
"Without using count to find you want character in string" method.
import re
def main():
    s = input("Enter a string, for example 'welcome': ")
    ch = input("Enter the character to count: ")
    # re.escape makes characters such as '.' or '*' match literally
    print(len(re.findall(re.escape(ch), s)))
main()
Python 3
There are two ways to achieve this:
1) With the built-in str.count() method
sentence = 'Mary had a little lamb'
print(sentence.count('a'))
2) With an explicit loop
sentence = 'Mary had a little lamb'
count = 0
for i in sentence:
    if i == "a":
        count = count + 1
print(count)
Use count:
sentence = 'A man walked up to a door'
print(sentence.count('a'))
# 3 (count is case-sensitive, so the capital 'A' is not counted)
NumPy's unique can return the counts as well:
import numpy as np
sample = 'samplestring'
a = np.unique(list(sample), return_counts=True)
a
Out:
(array(['a', 'e', 'g', 'i', 'l', 'm', 'n', 'p', 'r', 's', 't'], dtype='<U1'),
array([1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 1]))
Check 's'. You can filter this tuple of two arrays as follows:
a[1][a[0]=='s']
Side-note: It works like Counter() of the collections package, just in numpy, which you often import anyway. You could as well count the unique words in a list of words instead.
This is an extension of the accepted answer, should you look for the count of all the characters in the text.
# Objective: count only non-whitespace characters
text = "count a character occurrence"
unique_letters = set(text)
result = dict((x, text.count(x)) for x in unique_letters if x.strip())
print(result)
# {'a': 3, 'c': 6, 'e': 3, 'u': 2, 'n': 2, 't': 2, 'r': 4, 'h': 1, 'o': 2}
No more than this, IMHO. You can add the upper or lower methods if you need them:
def count_letter_in_str(string, letter):
    return string.count(letter)
You can use a loop and a dictionary.
def count_letter(text):
    result = {}
    for letter in text:
        if letter not in result:
            result[letter] = 0
        result[letter] += 1
    return result
spam = 'have a nice day'
var = 'd'
def count(spam, var):
    found = 0
    for key in spam:
        if key == var:
            found += 1
    return found
count(spam, var)
print('count %s is: %s' % (var, count(spam, var)))
Related
Using regular expression, list all the letters that follows a vowel according to their occurrence frequency
How can I find consonants letters that came after the vowels in words of string and count the frequency str = 'car regular double bad ' result19 = re.findall(r'\b\w*[aeiou][^ aeiou]\w*\b' , str) print(result19) #doesn't work Expected output letter r count = 2 letter b count = 1 letter d count = 1
I am not sure whether this is what you want, but it might help as an answer rather than a comment. I think you are on the right track, but you need a few modifications and some extra lines to get the expected result:
import re
myStr = 'car regular double bad '
result19 = re.findall(r'[aeiou][^aeiou\s]+', myStr)
myDict = {}
for value in result19:
    if value[1] not in myDict:
        myDict[value[1]] = 0
    myDict[value[1]] += 1
myDict
This results in a dictionary mapping each letter to the number of times it appeared:
{'b': 1, 'd': 1, 'g': 1, 'l': 1, 'r': 2}
For nicer output you can use a for loop to print each key and its value:
for char, value in myDict.items():
    print(char, "->", value)
Output
r -> 2
g -> 1
l -> 1
b -> 1
d -> 1
Your pattern \b\w*[aeiou][^ aeiou]\w*\b matches zero or more repetitions of a word character using \w* and only matches a single occurrence of [aeiou][^ aeiou] in the "word" If you want to match all consonant letters based on the alphabet a-z after a vowel, you can match a single occurrence of [aeiou] and use a capture group matching a single consonant. Then make use of re.findall to return a list of the group values. import re txt = 'car regular double bad ' lst = re.findall(r'[aeiou]([b-df-hj-np-tv-z])', txt) dct = {c: lst.count(c) for c in lst} print(dct) Output {'r': 2, 'g': 1, 'l': 1, 'b': 1, 'd': 1} If you want to match a non whitespace char other than a vowel after matching a vowel, you can use this pattern [aeiou]([^\saeiou]) Note that the l is also in the output as it comes after the u in ul
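The capture-group pattern above pairs naturally with collections.Counter, which avoids calling lst.count once per letter. A minimal sketch that prints in the format the question asked for (`followers` is an illustrative name):

```python
import re
from collections import Counter

txt = 'car regular double bad '

# Capture the single consonant that directly follows a vowel.
followers = re.findall(r'[aeiou]([b-df-hj-np-tv-z])', txt)

# Tally the captured consonants in one pass.
counts = Counter(followers)

# most_common() with no argument lists all letters, highest count first.
for letter, n in counts.most_common():
    print(f"letter {letter} count = {n}")
```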
count words from list in another list in entry one
Hi, I want to count given phrases from one list in the first entry (position zero) of another list. list_given_atoms= ['C', 'Cl', 'Br'] list_of_molecules= ['C(B2Br)[Cl{H]Cl}P' ,'NAME'] When Python finds a match it should be saved in a dictionary like countdict = [ 'Cl : 2', 'C : 1', 'Br : 1'] I already tried re.findall(r'\w+', list_of_molecules[0]), but that results in words like "B2Br", which is definitely not what I want. Can someone help me?
[a-zA-Z]+ should be used instead of \w+ because \w+ will match both letters and numbers, while you are just looking for letters: import re list_given_atoms= ['C', 'Cl', 'Br'] list_of_molecules= ['C(B2Br)[Cl{H]Cl}P' ,'NAME'] molecules = re.findall('[a-zA-Z]+', list_of_molecules[0]) final_data = {i:molecules.count(i) for i in list_given_atoms} Output: {'C': 1, 'Br': 1, 'Cl': 2}
You could use something like this: >>> Counter(re.findall('|'.join(sorted(list_given_atoms, key=len, reverse=True)), list_of_molecules[0])) Counter({'Cl': 2, 'C': 1, 'Br': 1}) You have to sort the elements by their length, so 'Cl' matches before 'C'.
Short re.findall() solution: import re list_given_atoms = ['C', 'Cl', 'Br'] list_of_molecules = ['C(B2Br)[Cl{H]Cl}P' ,'NAME'] d = { a: len(re.findall(r'' + a + '(?=[^a-z]|$)', list_of_molecules[0], re.I)) for a in list_given_atoms } print(d) The output: {'C': 1, 'Cl': 2, 'Br': 1}
I tried your solutions and figured out that there can also be several C's in a row. So I came up with this:
for element in re.findall(r'([A-Z])([a-z|A-Z])?', list_of_molecules[0]):
    if element[1].islower():
        counter = element[0] + element[1]
        if counter not in counter_dict:
            counter_dict[counter] = 1
        else:
            counter_dict[counter] += 1
In the same way I checked for single-letter elements and added them to the dictionary. There is probably a better way.
You can't use \w, because a word character is equivalent to [a-zA-Z0-9_], which clearly includes digits, so "B2Br" is matched. You also can't just use the regex [a-zA-Z]+, as that would produce a single atom for something like "CO2", which should produce two separate atoms: C and O.
However, the regex I came up with (regex101) just checks for a capital letter followed by an optional lower-case letter (between 0 and 1 of them). Here it is:
[A-Z][a-z]{0,1}
and it will correctly produce the atoms.
So to incorporate this into your original lists:
list_given_atoms= ['C', 'Cl', 'Br']
list_of_molecules= ['C(B2Br)[Cl{H]Cl}P' ,'NAME']
we first find all the atoms in list_of_molecules and then create a dictionary of the counts of the atoms in list_given_atoms. To find all the atoms, we can use re.findall on the first element of the molecules list:
atoms = re.findall("[A-Z][a-z]{0,1}", list_of_molecules[0])
which gives the list:
['C', 'B', 'Br', 'Cl', 'H', 'Cl', 'P']
Then, to get the counts in a dictionary, we can use a dictionary comprehension:
counts = {a: atoms.count(a) for a in list_given_atoms}
which gives the desired result:
{'Cl': 2, 'C': 1, 'Br': 1}
This would also work for molecules like CO2.
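Putting the pieces of this answer together, here is a runnable sketch. It swaps the repeated atoms.count(a) calls for a collections.Counter, which is a substitution on my part, and the 'CCO2' string is a made-up extra input to show consecutive atoms being split:

```python
import re
from collections import Counter

list_given_atoms = ['C', 'Cl', 'Br']
list_of_molecules = ['C(B2Br)[Cl{H]Cl}P', 'NAME']

# One capital letter, optionally followed by one lower-case letter.
atom_pattern = re.compile(r'[A-Z][a-z]?')

# Tally every atom symbol found in the first entry.
atoms = Counter(atom_pattern.findall(list_of_molecules[0]))

# Keep only the atoms that were asked for; absent atoms would count as 0.
counts = {atom: atoms[atom] for atom in list_given_atoms}
print(counts)

# Consecutive atoms are separated correctly, and digits are skipped.
print(atom_pattern.findall('CCO2'))
```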
Given an input as 'sentence', how to return the element that appears the most
This first function counts the string's characters def character_count(sentence): characters = {} for char in sentence: if char in characters: characters[char] = characters[char] + 1 else: characters[char] = 1 return characters This second function determines the most common character and identifies which one appears most often by characters[char] which is established in the previous helper function def most_common_character(sentence): chars = character_count(sentence) most_common = "" max_times = 0 for curr_char in chars: if chars[curr_char] > max_times: most_common = curr_char max_times = chars[curr_char] return most_common
Why not simply using what Python provides? >>> from collections import Counter >>> sentence = "This is such a beautiful day, isn't it" >>> c = Counter(sentence).most_common(3) >>> c [(' ', 7), ('i', 5), ('s', 4)] After if you really want to proceed word by word and avoid spaces: >>> from collections import Counter >>> sentence = "This is such a beautiful day, isn't it" >>> res = Counter(sentence.replace(' ', '')) >>> res.most_common(1) [('i', 5)]
You actually don't have to change anything! Your code will work with a list as is (the variable names just become misleading). Try it: most_common_character(['this', 'is', 'a', 'a', 'list']) Output: 'a' This will work for lists with any kind of elements that are hashable (numbers, strings, characters, etc)
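For comparison, the two helper functions from the question can be condensed with Counter and max. This is only a sketch of an equivalent; `most_common_item` is a hypothetical name, and on ties it returns the item that was seen first:

```python
from collections import Counter

def most_common_item(items):
    """Return the element that occurs most often in a string or list."""
    counts = Counter(items)
    # max() scans the keys in insertion order and keeps the first maximum.
    return max(counts, key=counts.get)

print(most_common_item("hello"))
print(most_common_item(['this', 'is', 'a', 'a', 'list']))
```

Unlike the original function, which returns an empty string for empty input, this version raises ValueError when items is empty.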
Counting Instances of Consecutive Duplicate Letters in a Python String
I'm trying to figure out how I can count the number of letters in a string that occur 3 times. The string is from raw_input(). For example, if my input is: abceeedtyooo The output should be: 2 This is my current code: print 'Enter String:' x = str(raw_input ("")) print x.count(x[0]*3)
To count number of consecutive duplicate letters that occur exactly 3 times: >>> from itertools import groupby >>> sum(len(list(dups)) == 3 for _, dups in groupby("abceeedtyooo")) 2
To count the chars in the string, you can use collections.Counter: >>> from collections import Counter >>> counter = Counter("abceeedtyooo") >>> print(counter) Counter({'e': 3, 'o': 3, 'a': 1, 'd': 1, 'y': 1, 'c': 1, 'b': 1, 't': 1}) Then you can filter the result as follows: >>> result = [char for char in counter if counter[char] == 3] >>> print(result) ['e', 'o'] If you want to match consecutive characters only, you can use regex (cf. re): >>> import re >>> result = re.findall(r"(.)\1\1", "abceeedtyooo") >>> print(result) ['e', 'o'] >>> result = re.findall(r"(.)\1\1", "abcaaa") >>> print(result) ['a'] This will also match if the same character appears three consecutive times multiple times (e.g. on "aaabcaaa", it will match 'a' twice). Matches are non-overlapping, so on "aaaa" it will only match once, but on "aaaaaa" it will match twice. Should you not want multiple matches on consecutive strings, modify the regex to r"(.)\1\1(?!\1)". To avoid matching any chars that appear more than 3 consecutive times, use (.)(?<!(?=\1)..)\1{2}(?!\1). This works around a problem with Python's regex module that cannot handle (?<!\1).
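If the lookaround regex is hard to read, the run lengths can be made explicit with itertools.groupby, which makes "exactly three in a row" a plain comparison. A sketch with hypothetical helper names `runs` and `count_runs_of`:

```python
from itertools import groupby

def runs(s):
    """Return (char, run_length) for each run of identical characters."""
    return [(char, len(list(group))) for char, group in groupby(s)]

def count_runs_of(s, length=3):
    """Count the runs whose length is exactly `length`."""
    return sum(1 for _, n in runs(s) if n == length)

print(runs("abceeedtyooo"))
print(count_runs_of("abceeedtyooo"))
print(count_runs_of("aaaa"))      # a run of four is not a run of three
print(count_runs_of("aaabcaaa"))  # two separate runs of three
```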
We can count the chars in the string through 'for' loop s="abbbaaaaaaccdaaab" st=[] count=0 for i in set(s): print(i+str(s.count(i)),end='') Output: a10c2b4d1
Counting word frequency and making a dictionary from it
I want to take every word from a text file, and count the word frequency in a dictionary. Example: 'this is the textfile, and it is used to take words and count' d = {'this': 1, 'is': 2, 'the': 1, ...} I am not that far, but I just can't see how to complete it. My code so far: import sys argv = sys.argv[1] data = open(argv) words = data.read() data.close() wordfreq = {} for i in words: #there should be a counter and somehow it must fill the dict.
If you don't want to use collections.Counter, you can write your own function: import sys filename = sys.argv[1] fp = open(filename) data = fp.read() words = data.split() fp.close() unwanted_chars = ".,-_ (and so on)" wordfreq = {} for raw_word in words: word = raw_word.strip(unwanted_chars) if word not in wordfreq: wordfreq[word] = 0 wordfreq[word] += 1 for finer things, look at regular expressions.
Although using Counter from the collections library, as suggested by @Michael, is a better approach, I am adding this answer just to improve your code. (I believe this will be a good answer for a new Python learner.)
From the comment in your code it seems that you want to improve your code. You are already able to read the file content (although I usually avoid the read() function and use for line in file_descriptor: instead).
Because words is a string, in the loop for i in words: the loop variable i is not a word but a char. You are iterating over the chars of the string instead of over the words in it. To see this, consider the following snippet:
>>> for i in "Hi, h r u?":
...     print(i)
...
H
i
,
 
h
 
r
 
u
?
>>>
Iterating over the string char by char instead of word by word is not what you want. To iterate word by word, use the split method of Python's string class. str.split(sep=None, maxsplit=-1) returns a list of the words in the string, using sep as the separator (it splits on runs of whitespace if sep is left unspecified), optionally limiting the number of splits to maxsplit. Notice the code examples below:
Split:
>>> "Hi, how are you?".split()
['Hi,', 'how', 'are', 'you?']
Loop with split:
>>> for i in "Hi, how are you?".split():
...     print(i)
...
Hi,
how
are
you?
This looks like what you need, except for the word Hi,: split() by default splits on whitespace, so Hi, is kept as a single string, and obviously you don't want that.
To count the frequency of words in the file, one good solution is to use regex. But first, to keep the answer simple, I will use the replace() method. str.replace(old, new[, count]) returns a copy of the string in which the occurrences of old have been replaced with new, optionally restricting the number of replacements to count.
Now check the code example below to see what I suggested:
>>> "Hi, how are you?".split()
['Hi,', 'how', 'are', 'you?']    # it has , with Hi
>>> "Hi, how are you?".replace(',', ' ').split()
['Hi', 'how', 'are', 'you?']     # , replaced by space, then split
Loop:
>>> for word in "Hi, how are you?".replace(',', ' ').split():
...     print(word)
...
Hi
how
are
you?
Now, how to count frequency: one way is to use Counter as @Michael suggested, but to keep your approach, in which you start from an empty dict, do something like this:
words = f.read()
wordfreq = {}
for word in words.replace(',', ' ').split():
    wordfreq[word] = wordfreq.setdefault(word, 0) + 1
    #                ^^ add 1 to 0 or to the old value from the dict
What am I doing? Because wordfreq is initially empty, you can't read wordfreq[word] the first time a word appears (it raises a KeyError). So I used the dict method setdefault. dict.setdefault(key, default=None) is similar to get(), but sets dict[key] = default if key is not already in the dict. So the first time a new word comes up, I set it to 0 in the dict using setdefault, then add 1 and assign the result to the same key.
Here is the equivalent code using with open instead of a bare open:
with open('file') as f:
    words = f.read()
wordfreq = {}
for word in words.replace(',', ' ').split():
    wordfreq[word] = wordfreq.setdefault(word, 0) + 1
print(wordfreq)
That runs like this:
$ cat file  # file is
this is the textfile, and it is used to take words and count
$ python work.py  # indented manually
{'this': 1, 'is': 2, 'the': 1, 'textfile': 1, 'and': 2, 'it': 1, 'used': 1, 'to': 1, 'take': 1, 'words': 1, 'count': 1}
Using re.split(pattern, string, maxsplit=0, flags=0)
Just change the for loop to for i in re.split(r"[,\s]+", words): and it should produce the correct output.
Edit: it is better to find all alphanumeric characters, because you may have more than one punctuation symbol.
>>> re.findall(r'[\w]+', words) # manually indent output ['this', 'is', 'the', 'textfile', 'and', 'it', 'is', 'used', 'to', 'take', 'words', 'and', 'count'] use for loop as: for word in re.findall(r'[\w]+', words): How would I write code without using read(): File is: $ cat file This is the text file, and it is used to take words and count. And multiple Lines can be present in this file. It is also possible that Same words repeated in with capital letters. Code is: $ cat work.py import re wordfreq = {} with open('file') as f: for line in f: for word in re.findall(r'[\w]+', line.lower()): wordfreq[word] = wordfreq.setdefault(word, 0) + 1 print wordfreq Used lower() to convert an upper letter to lower letter. output: $python work.py # manually strip output {'and': 3, 'letters': 1, 'text': 1, 'is': 3, 'it': 2, 'file': 2, 'in': 2, 'also': 1, 'same': 1, 'to': 1, 'take': 1, 'capital': 1, 'be': 1, 'used': 1, 'multiple': 1, 'that': 1, 'possible': 1, 'repeated': 1, 'words': 2, 'with': 1, 'present': 1, 'count': 1, 'this': 2, 'lines': 1, 'can': 1, 'the': 1}
from collections import Counter t = 'this is the textfile, and it is used to take words and count' dict(Counter(t.split())) >>> {'and': 2, 'is': 2, 'count': 1, 'used': 1, 'this': 1, 'it': 1, 'to': 1, 'take': 1, 'words': 1, 'the': 1, 'textfile,': 1} Or better with removing punctuation before counting: dict(Counter(t.replace(',', '').replace('.', '').split())) >>> {'and': 2, 'is': 2, 'count': 1, 'used': 1, 'this': 1, 'it': 1, 'to': 1, 'take': 1, 'words': 1, 'the': 1, 'textfile': 1}
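Chained replace() calls only remove the punctuation you thought of in advance. A sketch that combines Counter with a regular expression instead, written so it accepts any iterable of lines (an open file object works too); `word_frequencies` is a hypothetical helper name, and lower-casing the words is an assumption about what you want:

```python
import re
from collections import Counter

def word_frequencies(lines):
    """Count words across an iterable of lines, ignoring case and punctuation."""
    freq = Counter()
    for line in lines:
        # \w+ keeps runs of letters, digits and underscores only.
        freq.update(re.findall(r'\w+', line.lower()))
    return freq

sample = ['this is the textfile, and it is used',
          'to take words and count.']
wordfreq = word_frequencies(sample)
print(dict(wordfreq))
```

With a real file this would be called as `with open(filename) as f: wordfreq = word_frequencies(f)`, which reads line by line rather than loading the whole file.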
The following takes the string, splits it into a list with split(), for loops the list and counts the frequency of each item in the sentence with Python's count function count (). The words,i, and its frequency are placed as tuples in an empty list, ls, and then converted into key and value pairs with dict(). sentence = 'this is the textfile, and it is used to take words and count'.split() ls = [] for i in sentence: word_count = sentence.count(i) # Pythons count function, count() ls.append((i,word_count)) dict_ = dict(ls) print dict_ output; {'and': 2, 'count': 1, 'used': 1, 'this': 1, 'is': 2, 'it': 1, 'to': 1, 'take': 1, 'words': 1, 'the': 1, 'textfile,': 1}
sentence = "this is the textfile, and it is used to take words and count"
counter_dict = {}
# split the sentence into words and iterate through every word
for word in sentence.lower().split():
    # add the word to counter_dict, initialised with 0
    if word not in counter_dict:
        counter_dict[word] = 0
    # increase its count by 1
    counter_dict[word] += 1
#open your text book,Counting word frequency File_obj=open("Counter.txt",'r') w_list=File_obj.read() print(w_list.split()) di=dict() for word in w_list.split(): if word in di: di[word]=di[word] + 1 else: di[word]=1 max_count=max(di.values()) largest=-1 maxusedword='' for k,v in di.items(): print(k,v) if v>largest: largest=v maxusedword=k print(maxusedword,largest)
you can also use default dictionaries with int type. from collections import defaultdict wordDict = defaultdict(int) text = 'this is the textfile, and it is used to take words and count'.split(" ") for word in text: wordDict[word]+=1 explanation: we initialize a default dictionary whose values are of the type int. This way the default value for any key will be 0 and we don't need to check if a key is present in the dictionary or not. we then split the text with the spaces into a list of words. then we iterate through the list and increment the count of the word's count.
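The defaultdict approach and the plain-dict idioms used elsewhere in this thread give the same counts; the practical difference is what happens on lookup. A small side-by-side sketch (variable names are illustrative):

```python
from collections import defaultdict

words = 'this is the textfile, and it is used to take words and count'.split()

# Plain dict with get(): missing keys fall back to 0 explicitly.
with_get = {}
for word in words:
    with_get[word] = with_get.get(word, 0) + 1

# defaultdict(int): missing keys are created with int() == 0 on first access.
with_default = defaultdict(int)
for word in words:
    with_default[word] += 1

same_counts = with_get == dict(with_default)

# Caveat: merely reading a missing key from a defaultdict inserts it.
before = 'zebra' in with_default
_ = with_default['zebra']
after = 'zebra' in with_default

print(same_counts, before, after)
```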
wordList = 'this is the textfile, and it is used to take words and count'.split() wordFreq = {} # Logic: word not in the dict, give it a value of 1. if key already present, +1. for word in wordList: if word not in wordFreq: wordFreq[word] = 1 else: wordFreq[word] += 1 print(wordFreq)
My approach is to do few things from ground: Remove punctuations from the text input. Make list of words. Remove empty strings. Iterate through list. Make each new word a key into Dictionary with value 1. If a word is already exist as key then increment it's value by one. text = '''this is the textfile, and it is used to take words and count''' word = '' #This will hold each word wordList = [] #This will be collection of words for ch in text: #traversing through the text character by character #if character is between a-z or A-Z or 0-9 then it's valid character and add to word string.. if (ch >= 'a' and ch <= 'z') or (ch >= 'A' and ch <= 'Z') or (ch >= '0' and ch <= '9'): word += ch elif ch == ' ': #if character is equal to single space means it's a separator wordList.append(word) # append the word in list word = '' #empty the word to collect the next word wordList.append(word) #the last word to append in list as loop ended before adding it to list print(wordList) wordCountDict = {} #empty dictionary which will hold the word count for word in wordList: #traverse through the word list if wordCountDict.get(word.lower(), 0) == 0: #if word doesn't exist then make an entry into dic with value 1 wordCountDict[word.lower()] = 1 else: #if word exist then increament the value by one wordCountDict[word.lower()] = wordCountDict[word.lower()] + 1 print(wordCountDict) Another approach: text = '''this is the textfile, and it is used to take words and count''' for ch in '.\'!")(,;:?-\n': text = text.replace(ch, ' ') wordsArray = text.split(' ') wordDict = {} for word in wordsArray: if len(word) == 0: continue else: wordDict[word.lower()] = wordDict.get(word.lower(), 0) + 1 print(wordDict)
One more function: def wcount(filename): counts = dict() with open(filename) as file: a = file.read().split() # words = [b.rstrip() for b in a] for word in a: if word in counts: counts[word] += 1 else: counts[word] = 1 return counts
def play_with_words(input): input_split = input.split(",") input_split.sort() count = {} for i in input_split: if i in count: count[i] += 1 else: count[i] = 1 return count input ="i,am,here,where,u,are" print(play_with_words(input))
Write a Python program to create a list of strings by taking input from the user and then create a dictionary containing each string along with their frequencies. (e.g. if the list is [‘apple’, ‘banana’, ‘fig’, ‘apple’, ‘fig’, ‘banana’, ‘grapes’, ‘fig’, ‘grapes’, ‘apple’] then output should be {'apple': 3, 'banana': 2, 'fig': 3, 'grapes': 2}. lst = [] d = dict() print("ENTER ZERO NUMBER FOR EXIT !!!!!!!!!!!!") while True: user = input('enter string element :: -- ') if user == "0": break else: lst.append(user) print("LIST ELEMENR ARE :: ",lst) l = len(lst) for i in range(l) : c = 0 for j in range(l) : if lst[i] == lst[j ]: c += 1 d[lst[i]] = c print("dictionary is :: ",d)
You can also go with this approach. But you need to store the text file's content in a variable as a string first after reading the file. In this way, You don't need to use or import any external libraries. s = "this is the textfile, and it is used to take words and count" s = s.split(" ") d = dict() for i in s: c = "" if i.isalpha() == True: if i not in d: d[i] = 1 else: d[i] += 1 else: for j in i: l = len(j) if j.isalpha() == True: c+=j if c not in d: d[c] = 1 else: d[c] += 1 print(d) Result: