python count repeating characters in a string by using dictionary function - python

I have a string
string = 'AAA'
When using the string.count('A') the output is equal to 3
and if it is string.count('AA') the output is equal to 1
However, there are 2 'AA's in the string.
Is there a way to count repeated substrings like this, including overlapping ones, using a dictionary?
I'd like to hear your helpful suggestions.
Thank you all in advance.

The problem is that str.count returns the number of non-overlapping occurrences of substring sub in the string.
Try this instead, as shown in this post:
def occurrences(string, sub):
    count = start = 0
    while True:
        # find returns -1 when there is no further match, so start becomes 0
        start = string.find(sub, start) + 1
        if start > 0:
            count += 1
        else:
            return count

Alternative for "not huge" strings
>>> s, sub = 'AAA', 'AA'
>>> sum(s[x:].startswith(sub) for x in range(len(s)))
2
I find this a little more readable.

Yes, you can use a dictionary. Note that this counts single characters, not overlapping substrings:
def count(input):
    a = {}
    for i in input:
        a[i] = a.get(i, 0) + 1
    return a
print(count('AA')) # prints {'A': 2}
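If the goal is to count overlapping substrings with a dictionary, one possible sketch is to slide a fixed-width window over the string and tally each slice. The name `substring_counts` and the `width` parameter are illustrative assumptions, not part of the answers above.

```python
from collections import Counter

def substring_counts(text, width):
    """Tally every overlapping substring of length `width` (hypothetical helper)."""
    # Each start index from 0 to len(text) - width yields one window.
    return Counter(text[i:i + width] for i in range(len(text) - width + 1))

counts = substring_counts('AAA', 2)
print(counts['AA'])  # 2
```

Counter is a dict subclass, so the result can be used like the plain dictionary built by hand above.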

Related

call a method only if the previous one returned a value

I am looking for a specific string in text, and want to get it only if found.
I can write it as follows, and it works:
if re.search(r"Counter=(\d+)", line):
    total = int(re.search(r"Counter=(\d+)", line).group(1))
else:
    total = 1
But I remember a nicer way to write it, with a question mark before the "group", that uses 1 as the default value when re.search finds nothing.
Assuming you are using Python 3.8 or later, you can assign to the output of re.search in the if statement with the walrus-operator :=.
if match := re.search(r"Counter=(\d+)", line):
    total = int(match.group(1))
else:
    total = 1
If you want a one-liner:
total = int(match.group(1)) if (match := re.search(r"Counter=(\d+)", line)) else 1
re.findall returns a list of the captured groups, or an empty list if nothing matches.
You can then assign conditionally based on this list using a ternary expression:
counters = re.findall(r"Counter=(\d+)", line)
total = int(counters[0]) if counters else 1
(This will also work in Python before 3.8.)
Test with both cases:
counters = re.findall(r"Counter=(\d+)", 'Hello World')
total = int(counters[0]) if counters else 1
print(total)
# 1
counters = re.findall(r"Counter=(\d+)", 'Counter=100')
total = int(counters[0]) if counters else 1
print(total)
# 100
You could also let the pattern match the empty end of the line, so that group 1 is None when there is no counter, and fall back to 1:
total = int(re.search(r"Counter=(\d+)|$", line)[1] or 1)
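The default-value pattern can also be wrapped in a small helper for reuse. This is only a sketch: `first_int` and its parameters are hypothetical names, and it assumes the pattern has a capturing group around the digits.

```python
import re

def first_int(pattern, text, default=1):
    """Return the first captured group as an int, or `default` if nothing matches."""
    match = re.search(pattern, text)
    return int(match.group(1)) if match else default

print(first_int(r"Counter=(\d+)", "Counter=100"))  # 100
print(first_int(r"Counter=(\d+)", "Hello World"))  # 1
```

Because it avoids the walrus operator, this version also runs on Python versions before 3.8.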

How to count number of duplicates of item in string?

Suppose I have a string: s = "hello2020"
How can I create a program that returns the number of duplicates in the string? In this instance, the program would return 3 as the letter "l" appears more than once, and the numbers "2" and "0" also appear more than once.
Thanks in advance.
EDIT: So far, I have tried: return len([x for x in set(s) if s.count(x) > 1]), but it fails two of the testcases. Therefore, I am looking for an alternative solution.
from collections import Counter
def do_find_duplicates(x):
    dup_chars = 0
    for key, val in Counter(x).items():
        if val > 1:
            dup_chars += 1
    print(dup_chars)
do_find_duplicates('hello2020')
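One-liner solution: convert the string to a set, then subtract the size of the set from the length of the string. Note that this counts every extra occurrence, so it gives the expected answer only when each repeated character appears exactly twice (as in "hello2020"); "aaa" would give 2 rather than 1.
def duplicate_count(string):
    return len(string) - len(set(string))
print(duplicate_count("hello2020"))
#3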
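To count distinct characters that occur more than once, regardless of how often each one repeats, a Counter-based sketch along these lines may be closer to what the question asks. The function name `count_repeated_chars` is an illustrative assumption.

```python
from collections import Counter

def count_repeated_chars(text):
    """Count distinct characters that occur more than once (illustrative name)."""
    return sum(1 for n in Counter(text).values() if n > 1)

print(count_repeated_chars("hello2020"))  # 3
print(count_repeated_chars("aaa"))        # 1
print(len("aaa") - len(set("aaa")))       # 2, the subtraction counts surplus occurrences
```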

No. of times a sub-string occurs in the string

Below is my code:
def count_substring(string, sub_string):
    counter = 0
    for x in range(0,len(string)):
        if string[x]+string[x+1]+string[x+2] == sub_string:
            counter +=1
    return counter
When I run the code it throws an error - "IndexError: string index out of range"
Please help me in understanding what is wrong with my code and also with the solution.
I am a beginner in Python. Please explain this to me like I am 5.
Can't you simply use str.count for non-overlapping matches:
str.count(sub[, start[, end]])
full_str = 'Test for substring, check for word check'
sub_str = 'check'
print(full_str.count(sub_str))
Returns 2
If you have overlapping matches of your substring you could try re.findall with a positive lookahead:
import re
full_str = 'bobob'
sub_str = 'bob'
print(len(re.findall('(?='+sub_str+')',full_str)))
If you have the third-party regex module installed, you can instead pass overlapped=True to its findall:
import regex as re
full_str = 'bobob'
sub_str = 'bob'
print(len(re.findall(sub_str, full_str, overlapped=True)))
Both options will return: 2
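Because the lookahead pattern is built from the substring itself, characters such as '.' or '+' would act as regex operators unless they are escaped. A minimal sketch of the lookahead approach with escaping added follows; the helper name `count_overlapping` is hypothetical.

```python
import re

def count_overlapping(text, sub):
    """Count overlapping occurrences of `sub` in `text` (hypothetical helper)."""
    # re.escape makes characters such as '.' match literally.
    return len(re.findall('(?=' + re.escape(sub) + ')', text))

print(count_overlapping('bobob', 'bob'))  # 2
print(count_overlapping('aXa.a', 'a.a'))  # 1, the '.' only matches a literal dot
```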
Couldn't you just use count? It takes far less code. See JvdV's answer. This is how I would do it:
def count_substring(string, substring):
    print(string.count(substring))
This simplifies the code a lot. You could also drop the function entirely and do this:
print(string.count(substring)) # you have to define string and substring first
If you want to include overlapping matches, then do this:
def count(string, substring):
    string_size = len(string)
    substring_size = len(substring)
    count = 0
    # Stop early enough that every slice has the full substring length.
    for i in range(0, string_size - substring_size + 1):
        if string[i:i + substring_size] == substring:
            count += 1
    return count
String has built-in method count for this purpose.
string = 'This is the way to do it.'
string.count('is')
Output: 2

Return Alternating Letters With the Same Length From two Strings

There was a similar question asked on here, but they wanted the remaining letters returned if one word was longer. I'm trying to return the same number of characters from both strings.
Here's my code:
def one_each(st, dum):
    total = ""
    for i in (st, dm):
        total += i
    return total
x = one_each("bofa", "BOFAAAA")
print(x)
It doesn't work but I'm trying to get this desired output:
>>>bBoOfFaA
How would I go about solving this? Thank you!
str.join with zip is possible, since zip only iterates pairwise up to the shortest iterable. You can combine with itertools.chain to flatten an iterable of tuples:
from itertools import chain
def one_each(st, dum):
    return ''.join(chain.from_iterable(zip(st, dum)))
x = one_each("bofa", "BOFAAAA")
print(x)
bBoOfFaA
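To see why zip gives both strings the same number of characters, it can help to look at the pairs it produces. This is a small illustrative sketch; the variable names are assumptions.

```python
short, long_ = "bofa", "BOFAAAA"

# zip stops as soon as the shorter input is exhausted.
pairs = list(zip(short, long_))
print(pairs)       # [('b', 'B'), ('o', 'O'), ('f', 'F'), ('a', 'A')]
print(len(pairs))  # 4, the length of the shorter string
```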
I'd probably do something like this
s1 = "abc"
s2 = "123"
ret = "".join(a+b for a,b in zip(s1, s2))
print (ret)
Here's a short way of doing it.
def one_each(short, long):
    if len(short) > len(long):
        short, long = long, short # Swap if the input is in incorrect order
    index = 0
    new_string = ""
    for character in short:
        new_string += character + long[index]
        index += 1
    return new_string
x = one_each("bofa", "BOFAAAA") # returns bBoOfFaA
print(x)
The result can come out in the wrong order when the first argument is the longer one: one_each("abcdefghij", "ABCD") gives AaBbCcDd because of the swap. That is easy to fix by swapping the case of each letter in the output.

How to find number of 'xx' pairs in a barcode recursively (python)

I am using python.
I'm having trouble with this recursion problem: I am trying to find how many pairs of identical adjacent characters there are in a string. For example, 'xx' would return 1 and 'xxx' would also return 1 because the pairs are not allowed to overlap. 'aabbb' would return 2.
I am completely stuck. I thought of breaking the word up into length 2 strings and recursing through the string like that, but then cases like 'aaa' would result in incorrect output.
Thanks.
Not sure why you want to do this recursively. If you wish to avoid regex, you can still just scan the string from left to right. For example, using itertools.groupby
>>> from itertools import groupby
>>> s = 'aabbb'
>>> sum(sum(1 for i in g)//2 for k,g in groupby(s))
2
>>> s = 'yyourr ssstringg'
>>> sum(sum(1 for i in g)//2 for k,g in groupby(s))
4
sum(1 for i in g) is used to find the length of the group. If the groups are not very long you can use len(list(g)) instead
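As a rough illustration of what groupby yields here, the sketch below first lists each run of equal adjacent characters with its length and then derives the pair count from those lengths. The variable names are assumptions.

```python
from itertools import groupby

s = 'aabbb'
# Each run of equal adjacent characters becomes one group.
runs = [(char, len(list(group))) for char, group in groupby(s)]
print(runs)  # [('a', 2), ('b', 3)]

# A run of length n holds n // 2 non-overlapping pairs.
print(sum(n // 2 for _, n in runs))  # 2
```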
You can use regex for that:
import re
s = 'yyourr ssstringg'
print(len(re.findall(r'(\w)\1', s)))
[OUTPUT]
4
This also takes care of your "overlaps-not-allowed" problem: as you can see in the above example, it prints 4 and not 5.
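For a recursive approach, you can do it as:
st = 'yyourr ssstringg'
def get_double(s):
    if len(s) < 2:
        return 0
    # Find the first adjacent pair, then recurse on what follows it.
    for i in range(len(s) - 1):
        if s[i] == s[i + 1]:
            return 1 + get_double(s[i + 2:])
    return 0  # no pair left in the string
>>> print(get_double(st))
4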
And without a for loop:
st = 'yyourr sstringg'
def get_double(s):
    if len(s) < 2:
        return 0
    elif s[0] == s[1]:
        return 1 + get_double(s[2:])
    else:
        return 0 + get_double(s[1:])
>>> print(get_double(st))
4
I would evaluate it in chunks of two.
For example, "sskkkj" would be looked at as two sets of two-character strings:
"ss", "kk", "kj" # from index 0
"sk", "kk" # offset by 1
Look at the two sets at the same time and add only one to the count if either has a pair.
