Finding duplicates in a string in Python 3

def find_duplicate():
    x = input("Enter a word = ")
    for char in x:
        counts = x.count(char)
        while counts > 1:
            return print(char, counts)
I have a small problem here: I want to find all duplicates in a string, but this program gives me only one of them. For example, with the input aassdd the function gives me only a : 2, but I need the output in the form a : 2 s : 2 d : 2. Thanks for your answers.

return is a keyword that works more or less as immediately exit this function (and optionally carry some output with you). You thus need to remove the return statement:
def find_duplicate():
    x = input("Enter a word = ")
    for char in x:
        counts = x.count(char)
        print(char, counts)
Furthermore, you also have to remove the while loop (or update the counter if you want to print multiple times); otherwise you will get stuck in an infinite loop, since counts is not updated and the test will thus always succeed.
Mind however that in this case a will be printed multiple times (in this case two) if it is found multiple times in the string. You can solve this issue by first constructing a set of the characters in the string and iterate over this set:
def find_duplicate():
    x = input("Enter a word = ")
    for char in set(x):
        counts = x.count(char)
        print(char, counts)
Finally, it is better to separate functions that calculate from functions that do I/O (for instance print). So it is better to write one function that returns a dictionary with the counts, and another that prints that dictionary. You can generate the dictionary like this:
def find_duplicate(x):
    result = {}
    for char in set(x):
        result[char] = x.count(char)
    return result
And a calling function:
def do_find_duplicates():
    # The word is read here, so the function takes no parameter
    x = input("Enter a word = ")
    for key, val in find_duplicate(x).items():
        print(key, val)
And now the best part is: you actually do not need to write the find_duplicate function: there is a utility class for that: Counter:
from collections import Counter
def do_find_duplicates():
    # The word is read here, so the function takes no parameter
    x = input("Enter a word = ")
    for key, val in Counter(x).items():
        print(key, val)

This will help you.
def find_duplicate():
    x = input("Enter a word = ")
    for char in set(x):
        counts = x.count(char)
        while counts > 1:
            print(char, ":", counts, end=' ')
            break
find_duplicate()

Just because this is fun, a solution that leverages the built-ins to avoid writing any more custom code than absolutely needed:
from collections import Counter, OrderedDict
# To let you count characters while preserving order of first appearance
class OrderedCounter(Counter, OrderedDict): pass
def find_duplicate(word):
    return [(ch, cnt) for ch, cnt in OrderedCounter(word).items() if cnt > 1]
It's likely more efficient (it doesn't recount each character over and over), only reports each character once, and uses arguments and return values instead of input and print, so it's more versatile (your main method can prompt for input and print output if it chooses).
Usage is simple (and thanks to OrderedCounter, it preserves order of first appearance in the original string too):
>>> find_duplicate('aaacfdedbfrf')
[('a', 3), ('f', 3), ('d', 2)]
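A side note for newer interpreters: since Python 3.7 plain dicts keep insertion order, and Counter is a dict subclass, so the helper class is probably not needed there. A minimal sketch under that assumption (the name find_duplicate_37 is made up for this example):

```python
from collections import Counter

def find_duplicate_37(word):
    # Counter is a dict subclass; on Python 3.7+ it keeps first-insertion order.
    return [(ch, cnt) for ch, cnt in Counter(word).items() if cnt > 1]

print(find_duplicate_37('aaacfdedbfrf'))
print(find_duplicate_37('aassdd'))
```

On older versions the OrderedCounter approach is still the safer choice.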

def find_duplicate():
    x = input("Enter a word = ")
    dup_letters = []
    dup_num = []
    for char in x:
        if char not in dup_letters and x.count(char) > 1:
            dup_letters.append(char)
            dup_num.append(x.count(char))
    return zip(dup_letters, dup_num)
dup = find_duplicate()
for i in dup:
    print(i)

This version should be fast, as I am not using any library or more than one loop. Do you have any faster options?
import datetime
start_time = datetime.datetime.now()
some_string = 'Laptop' * 99999
ans_dict = {}
for i in some_string:
    if i in ans_dict:
        ans_dict[i] += 1
    else:
        ans_dict[i] = 1
print(ans_dict)
end_time = datetime.datetime.now()
print(end_time - start_time)
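One way to answer the "faster options" question is to measure it. The sketch below compares the hand-written loop with collections.Counter using timeit; the function names are made up for this example, and the timings depend on the machine, so none are quoted here. On CPython, Counter usually comes out ahead because its counting loop runs in C, but that is worth checking rather than assuming.

```python
import timeit
from collections import Counter

some_string = 'Laptop' * 99999

def count_with_loop(text):
    # Same single-pass dict loop as in the answer above.
    counts = {}
    for ch in text:
        if ch in counts:
            counts[ch] += 1
        else:
            counts[ch] = 1
    return counts

def count_with_counter(text):
    return Counter(text)

# Both approaches must agree before a timing comparison means anything.
assert count_with_loop(some_string) == dict(count_with_counter(some_string))

for fn in (count_with_loop, count_with_counter):
    seconds = timeit.timeit(lambda: fn(some_string), number=3)
    print(fn.__name__, round(seconds, 3))
```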

def find_duplicate():
    x = input("Enter a word = ")
    y = ""
    check = ""
    for char in x:
        if x.count(char) > 1 and char not in y and char != check:
            y += (char + ":" + str(x.count(char)) + " ")
            check = char
    return y.strip()

Related

Python: count string and output letters that match a number

I want to take any string and have the user input a number. The output should then be the letters that appear exactly that many times. For example, if the user inputs "apple" and the number 2, the output should be "p". Any advice? So far I have only managed to count the letters.
You could make use of the set() function to get all the unique characters, iterate through the resultant set, and match the character count for each of the values retrieved. You can use the following code to achieve the desired output.
userInput = input('Enter a string: ')
matchNumValue = int(input('Enter a number: '))
matchingCharacters = [charValue for charValue in list(set(userInput)) if userInput.count(charValue) == matchNumValue]
print(matchingCharacters)
Hope this helps! 😊
You can use the count method.
Here is an example:
word = input('Enter a string: ')
number = int(input('Enter a number: '))
usedLetters = []
for letter in word:
    if letter not in usedLetters:
        n = word.count(letter)
        usedLetters.append(letter)
        if n == number:
            # Print the matching letter, not its count
            print(letter)
For the input "apple" and the number 2, the output will be:
p
Most intuitive without any extra libraries is just to use a dict to keep track of the number of occurrences of each letter. Then iterate through to see which ones have the correct number of occurrences.
def countString(string, num):
    counter = {}
    res = []
    for char in string:
        if char in counter.keys():
            counter[char] += 1
        else:
            counter[char] = 1
    for k, v in counter.items():
        if v == num:
            res.append(k)
    return res
print(countString('apple', 2))
You could use a collections.Counter to count the characters in the string and then invert it into a dictionary mapping each count to the list of characters with that count. collections.defaultdict creates new key/value pairs for you, which keeps the line count down.
import collections
def count_finder(string, count):
    counts = collections.defaultdict(list)
    for char, cnt in collections.Counter(string).items():
        counts[cnt].append(char)
    return counts.get(count, [])
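# If you don't want to import anything, just use this code.
a = int(input('Enter the number'))
b = 'banana'
l = []
for i in b:
    # Keep each matching letter once, in order of first appearance
    if a == b.count(i) and i not in l:
        l.append(i)
# Printing the list avoids an IndexError from pop() when nothing matches
print(l)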

Python code to count the recurrence of a letter in a word

I need your help to calculate the recurrence of a letter in the word.
Input (string): HelloWorld
Output: H1e1l3o2W1r1d1
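What you describe is a count of how often each character occurs, listed in order of first appearance. The article below calls this run-length encoding, and its code produces the kind of output you want.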
GeeksforGeeks has a great article on this:
https://www.geeksforgeeks.org/run-length-encoding-python/
# Python code for run length encoding
from collections import OrderedDict
def runLengthEncoding(input):
    # Generate ordered dictionary of all lower
    # case alphabets, its output will be
    # dict = {'w':0, 'a':0, 'd':0, 'e':0, 'x':0}
    dict=OrderedDict.fromkeys(input, 0)
    # Now iterate through input string to calculate
    # frequency of each character, its output will be
    # dict = {'w':4,'a':3,'d':1,'e':1,'x':6}
    for ch in input:
        dict[ch] += 1
    # now iterate through dictionary to make
    # output string from (key,value) pairs
    output = ''
    for key,value in dict.items():
        output = output + key + str(value)
    return output
# Driver function
if __name__ == "__main__":
    input="wwwwaaadexxxxxx"
    print (runLengthEncoding(input))
Output:
'w4a3d1e1x6'
Your Example:
input = 'hello world'
print(runLengthEncoding(input))
Output:
'h1e1l3o2 1w1r1d1'
Exactly how you wanted it.
Above code from GeeksforGeeks link.
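It may help to see that the code above counts every occurrence of a character in the whole string, which is what the question asks for, whereas run-length encoding in the strict sense counts only consecutive repeats. The two agree on "wwwwaaadexxxxxx" but differ on "HelloWorld". A small sketch of both (the function names are made up for this example, and the first one relies on dict ordering in Python 3.7+):

```python
from collections import Counter
from itertools import groupby

def letter_frequencies(word):
    # Total count per character, in order of first appearance (Python 3.7+).
    return ''.join(f'{ch}{n}' for ch, n in Counter(word).items())

def run_lengths(word):
    # Length of each run of consecutive identical characters.
    return ''.join(f'{ch}{len(list(run))}' for ch, run in groupby(word))

print(letter_frequencies('HelloWorld'))  # H1e1l3o2W1r1d1
print(run_lengths('HelloWorld'))         # H1e1l2o1W1o1r1l1d1
```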
As mentioned by others, you can use str.count(). One easy approach is to look at the first letter, count it, then delete all instances of it from the string and repeat. A simple recursive answer might look like:
def count(word):
    if len(word) == 0:
        return ""
    return word[0]+str(word.count(word[0]))+count(word[1:].replace(word[0], ""))
Use string.count().
The syntax is as follows:
string.count(substring, [start_index], [end_index])
substring is the text you are trying to find, [start_index] is the index at which to start searching (remember, Python indexes start at 0) and [end_index] is the index at which to stop searching (the character at that index is not included).
I think this function should do the trick:
def countoccurences(word, character):
    occuresin = []
    for letter in word:
        if letter == character:
            occuresin.append(letter)
    print("Letter", character, " occurs in string: ", str(len(occuresin)), " times.")
    return len(occuresin)
countoccurences("1se3sr4g45h7e5q3e", 'e')

Counting consecutive repeats in a string and returning a value in python

I asked my friend to give me an assignment to practice on. It is:
If a user enters the string "AAABNNNNNNDJSSSJENDDKEW", the program will return
"3AB6NDJ3SJEN2DKEW", and vice versa.
This what I tried so far:
from collections import Counter
list_user_input =[]
list_converted_output=[]
current_char = 0 #specifies the char it is reading
next_char = 1
cycle = 0 # counts number of loops
char_repeat = 1
prev_char=""
count = 1
user_input = input("Enter your string: ")
user_input_strip = user_input.strip()
user_input_striped_replace = user_input_strip.replace(" ", "").lower()
list_user_input.append(user_input_striped_replace[0:len(user_input_striped_replace)])
print(list_user_input)
print(user_input_striped_replace)
I have "cleaned" the input so that white space is removed and everything is lower case.
Here is where I am stuck: the logic. I was thinking of going through the string one index at a time and comparing each character to the next one. Is this the right way to go about it? I'm not even sure about the loop construction.
#counter = Counter(list_user_input)
#print(counter)
#while cycle <= len(user_input_striped_replace):
for letter in user_input_striped_replace:
    cycle+=1
    print("index nr {}, letter: ".format(current_char)+letter +" and cycle : " + str(cycle))
    current_char+=1
    if letter[0:1] == letter[1:2]:
        print("match")
        print("index nr {}, letter: ".format(current_char)+letter +" and cycle : " + str(cycle))
        current_char+=1
Counter is a good choice for such a task. For the rest, you can use sorted to order the Counter items by first appearance, then a list comprehension to build the desired pieces, and join to concatenate them. Note that this counts every occurrence of a letter in the whole string, not only consecutive repeats:
>>> from collections import Counter
>>> s = "AAABNNNNNNDJSSSJENDDKEW"
>>> c=Counter(s)
>>> sor=sorted(c.items(),key=lambda x:s.index(x[0]))
>>> ''.join([i if j==1 else '{}{}'.format(j,i) for i,j in sor])
'3AB7N3D2J3S2EKW'
I'd do it with regular expressions. Have a look at those.
Spoiler:
import re
def encode(s):
    return re.sub(r'(.)\1+', lambda m: str(len(m.group(0)))+m.group(1), s)
def decode(e):
    # Raw string, so that \d is not treated as a string escape
    return re.sub(r'(\d+)(.)', lambda m: int(m.group(1))*m.group(2), e)
s = "AAABNNNNNNDJSSSJENDDKEW"
e = encode(s)
print(e, decode(e) == s)
Prints:
3AB6NDJ3SJEN2DKEW True
Your "and vice versa" sentence sounds like the program needs to detect itself whether to encode or to decode, so here's that (proof of correctness left as an exercise :-)
def switch(s):
    e = re.sub(r'(\D)\1+', lambda m: str(len(m.group(0)))+m.group(1), s)
    d = re.sub(r'(\d+)(.)', lambda m: int(m.group(1))*m.group(2), s)
    return e if e != s else d
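For comparison, here is a sketch without regular expressions that follows the idea from the question of walking the string and comparing each character with the previous one. It assumes the text to encode contains no digits (otherwise decoding would be ambiguous); the function names encode and decode are reused here only for illustration.

```python
def encode(text):
    # Walk the string once, comparing each character with the previous one.
    if not text:
        return ''
    pieces = []
    current, run = text[0], 1
    for ch in text[1:]:
        if ch == current:
            run += 1
        else:
            # A run of length 1 is written without a count.
            pieces.append((str(run) if run > 1 else '') + current)
            current, run = ch, 1
    pieces.append((str(run) if run > 1 else '') + current)
    return ''.join(pieces)

def decode(encoded):
    # Digits accumulate into a repeat count that applies to the next character.
    pieces, digits = [], ''
    for ch in encoded:
        if ch.isdigit():
            digits += ch
        else:
            pieces.append(ch * (int(digits) if digits else 1))
            digits = ''
    return ''.join(pieces)

s = "AAABNNNNNNDJSSSJENDDKEW"
print(encode(s), decode(encode(s)) == s)
```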

Why is this not correct? (CodeEval challenge, Python)

This is what I have to do https://www.codeeval.com/open_challenges/140/
I've been on this challenge for three days, please help. My solution is scored as 85-90% (partially solved), but not 100% solved. Why?
This is my code:
import sys
test_cases = open(sys.argv[1], 'r')
for test in test_cases:
    saver=[]
    text=""
    textList=[]
    positionList=[]
    num=0
    exists=int()
    counter=0
    for l in test.strip().split(";"):
        saver.append(l)
    for i in saver[0].split(" "):
        textList.append(i)
    for j in saver[1].split(" "):
        positionList.append(j)
    for i in range(0,len(positionList)):
        positionList[i]=int(positionList[i])
    accomodator=[None]*len(textList)
    for n in range(1,len(textList)):
        if n not in positionList:
            accomodator[n]=textList[len(textList)-1]
            exists=n
    for item in positionList:
        accomodator[item-1]=textList[counter]
        counter+=1
        if counter>item:
            accomodator[exists-1]=textList[counter]
    for word in accomodator:
        text+=str(word) + " "
    print text
test_cases.close()
This code works for me (it is Python 2, like yours):
import sys
def main(name_file):
    _file = open(name_file, 'r')
    text = ""
    while True:
        try:
            line = _file.next()
            disordered_line, numbers_string = line.split(';')
            numbers_list = map(int, numbers_string.strip().split(' '))
            missing_number = sum(xrange(sorted(numbers_list)[0],sorted(numbers_list)[-1]+1)) - sum(numbers_list)
            if missing_number == 0:
                # Fall back to the number of words, not the number of characters
                missing_number = len(disordered_line.split(' '))
            numbers_list.append(missing_number)
            disordered_list = disordered_line.split(' ')
            string_position = zip(disordered_list, numbers_list)
            ordered = sorted(string_position, key = lambda x: x[1])
            text += " ".join([x[0] for x in ordered])
            text += "\n"
        except StopIteration:
            break
    _file.close()
    print text.strip()
if __name__ == '__main__':
    main(sys.argv[1])
I'll try to explain my code step by step, so maybe you can see the difference between your code and mine:
while True
A loop that breaks when there are no more lines.
try:
I put the code inside a try block and catch the StopIteration exception, because it is raised when there are no more items in an iterator.
line = _file.next()
The file object is used as an iterator, so the lines are read one at a time instead of being loaded into memory all at once.
disordered_line, numbers_string = line.split(';')
Get the unordered phrase and the numbers of every string's position.
numbers_list = map(int, numbers_string.strip().split(' '))
Convert every number from string to int
missing_number = sum(xrange(sorted(numbers_list)[0],sorted(numbers_list)[-1]+1)) - sum(numbers_list)
Get the missing number from the series of numbers; that missing number is the position of the last string in the phrase.
if missing_number == 0:
    missing_number = len(disordered_line.split(' '))
Check whether the missing number is equal to 0. If so, the real missing number is equal to the number of strings that make up the phrase.
numbers_list.append(missing_number)
Append the missing number to the list of numbers.
disordered_list = disordered_line.split(' ')
Convert the disordered phrase into a list.
string_position = zip(disordered_list, numbers_list)
Combine every string with its respective position.
ordered = sorted(string_position, key = lambda x: x[1])
Order the combined list by the position of the string.
text += " ".join([x[0] for x in ordered])
Concatenate the ordered phrase; the remaining code is easy to understand.
UPDATE
Looking at your code, here is my opinion on what might solve your problem.
split already returns a list, so you do not have to loop over its result just to add the items to another list.
So these six lines:
for l in test.strip().split(";"):
    saver.append(l)
for i in saver[0].split(" "):
    textList.append(i)
for j in saver[1].split(" "):
    positionList.append(j)
can be converted into three:
splitted_test = test.strip().split(';')
textList = splitted_test[0].split(" ")
positionList = map(int, splitted_test[1].split(" "))
In this line, positionList = map(int, splitted_test[1].split(" ")), you already convert the numbers to int, so you can drop these two lines:
for i in range(0,len(positionList)):
    positionList[i]=int(positionList[i])
The next lines:
accomodator=[None]*len(textList)
for n in range(1,len(textList)):
    if n not in positionList:
        accomodator[n]=textList[len(textList)-1]
        exists=n
can be converted into the next four:
missing_number = sum(xrange(sorted(positionList)[0],sorted(positionList)[-1]+1)) - sum(positionList)
if missing_number == 0:
    missing_number = len(textList)
positionList.append(missing_number)
Basically, what these lines do is calculate the missing number in the series of numbers, so that the series has the same length as textList.
The next lines:
for item in positionList:
    accomodator[item-1]=textList[counter]
    counter+=1
    if counter>item:
        accomodator[exists-1]=textList[counter]
for word in accomodator:
    text+=str(word) + " "
Can be replaced by these ones:
string_position = zip(textList, positionList)
ordered = sorted(string_position, key = lambda x: x[1])
text += " ".join([x[0] for x in ordered])
text += "\n"
This way you save lines and memory. Also use xrange instead of range (in Python 2, xrange does not build the whole list in memory).
The factors that make your code pass only partially might be:
The number of lines in the script.
The time your script takes.
The amount of memory your script uses.
What you could do is:
Use generators; this saves memory.
Reduce the number of for loops; this saves lines of code and time.
If you think something can be made simpler, do it.
Do not reinvent the wheel: if something has already been written, use it.
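The steps described above can also be sketched in Python 3. This is only an illustration under stated assumptions: each input line is taken to look like "word word word;pos pos", positions are 1-based, and exactly one position is missing (the one belonging to the last word). The function name reorder and the sample line are made up; the missing position is computed from the sum 1 + 2 + ... + n rather than from the minimum and maximum of the list.

```python
def reorder(line):
    # Assumed input format: "word word word;pos pos" with 1-based positions
    # and exactly one position missing (it belongs to the last word).
    words_part, positions_part = line.strip().split(';')
    words = words_part.split()
    positions = [int(p) for p in positions_part.split()]
    n = len(words)
    # The positions 1..n sum to n*(n+1)//2, so the gap is the difference.
    positions.append(n * (n + 1) // 2 - sum(positions))
    ordered = [None] * n
    for word, pos in zip(words, positions):
        ordered[pos - 1] = word
    return ' '.join(ordered)

print(reorder('world hello again;2 1'))
```

Computing the gap from the full sum also covers the case where the missing position is 1, which a minimum-to-maximum range would not detect.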

Can't get my count function to work in Python

I'm trying to create a function where you can put in a phrase such as "ana" and a word such as "banana", and count how many times the phrase occurs in the word. I can't find the error that makes some of my unit tests fail.
def test(actual, expected):
    """ Compare the actual to the expected value,
        and print a suitable message.
    """
    import sys
    linenum = sys._getframe(1).f_lineno # get the caller's line number.
    if (expected == actual):
        msg = "Test on line {0} passed.".format(linenum)
    else:
        msg = ("Test on line {0} failed. Expected '{1}', but got '{2}'.".format(linenum, expected, actual))
    print(msg)
def count(phrase, word):
    count1 = 0
    num_phrase = len(phrase)
    num_letters = len(word)
    for i in range(num_letters):
        for x in word[i:i+num_phrase]:
            if phrase in word:
                count1 += 1
            else:
                continue
    return count1
def test_suite():
    test(count('is', 'Mississippi'), 2)
    test(count('an', 'banana'), 2)
    test(count('ana', 'banana'), 2)
    test(count('nana', 'banana'), 1)
    test(count('nanan', 'banana'), 0)
    test(count('aaa', 'aaaaaa'), 4)
test_suite()
Changing your count function to the following passes the tests:
def count(phrase, word):
    count1 = 0
    num_phrase = len(phrase)
    num_letters = len(word)
    for i in range(num_letters):
        if word[i:i+num_phrase] == phrase:
            count1 += 1
    return count1
Use str.count(substring). This will return how many times the substring occurs in the full string (str).
Here is an interactive session showing how it works:
>>> 'Mississippi'.count('is')
2
>>> 'banana'.count('an')
2
>>> 'banana'.count('ana')
1
>>> 'banana'.count('nana')
1
>>> 'banana'.count('nanan')
0
>>> 'aaaaaa'.count('aaa')
2
>>>
As you can see, the function is non-overlapping. If you need overlapping behaviour, look here: string count with overlapping occurrences
You're using the iteration wrong:
for i in range(num_letters):  # i goes 0, 1, ..., len(word) - 1
    for x in word[i:i+num_phrase]:
        # This gives you the letters from word[i] up to word[i+num_phrase-1],
        # but one by one, so: for x in 'dada' will give you 'd' 'a' 'd' 'a'
        if phrase in word:  # This condition doesn't make sense in your problem:
                            # if it's true, it stays true through the whole
                            # iteration and the count ends up being roughly
                            # len(word) * num_phrase,
                            # and if it's false the function returns 0
            count1 += 1
        else:
            continue
I think str.count(substring) is the wrong solution here, because it doesn't count overlapping substrings, so the test suite fails.
There is also the built-in str.find method, which could be helpful for the task.
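A minimal sketch of that str.find idea, assuming a non-empty phrase (the name count_overlapping is made up for this example). The search restarts one character after each hit, so overlapping matches are counted too:

```python
def count_overlapping(phrase, word):
    # str.find returns -1 when there is no further match.
    total = 0
    start = word.find(phrase)
    while start != -1:
        total += 1
        # Step forward by one character only, so overlapping matches are seen.
        start = word.find(phrase, start + 1)
    return total

print(count_overlapping('ana', 'banana'))  # 2
print(count_overlapping('aaa', 'aaaaaa'))  # 4
```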
Another way:
def count(sequence, item):
    count = 0
    for x in sequence:
        if x == item:
            count = count+1
    return count
This raises a basic question:
when you see a string like "isisisisisi", how many times do you count "isi"?
In the first reading you split the string as "isi s isi s isi" and return 3 as the count.
In the second reading you count each inner "i" twice, once for each phrase it belongs to, like this: "isi isi isi isi isi".
In other words, the second 'i' is the last character of the first 'isi' and also the first character of the second 'isi'.
So you have to return 5 as the count.
For the first reading you can simply use:
>>> string = "isisisisisi"
>>> string.count("isi")
3
For the second reading you have to recognize the "phrase" + "anything" + "phrase" pattern in the search keyword.
The function below can do that:
def find_iterate(Str):
    i = 1
    cnt = 0
    # // keeps this an integer comparison on both Python 2 and Python 3
    while Str[i-1] == Str[-i] and i < len(Str)//2:
        i += 1
        cnt += 1
    return Str[0:cnt+1]
Now you have several ways to count the search keyword in the string.
For example, I do it like this:
if __name__ == "__main__":
    search_keyword = "isi"
    String = "isisisisisi"
    itterated_part = find_iterate(search_keyword)
    c = 0
    while search_keyword in String:
        c += String.count(search_keyword)
        String = String.replace(search_keyword, itterated_part)
    print(c)
I do not know whether there is a better way in Python. I tried to do this with regular expressions but found no way.
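Regular expressions can in fact do this with a zero-width lookahead. The sketch below is one possible approach rather than the method used above; it assumes a non-empty search keyword, and the function name count_overlapping_re is made up for this example.

```python
import re

def count_overlapping_re(phrase, text):
    # (?=...) is a zero-width lookahead: it matches at every position where
    # phrase starts, without consuming characters, so overlaps are counted.
    return len(re.findall('(?=' + re.escape(phrase) + ')', text))

print(count_overlapping_re('isi', 'isisisisisi'))  # 5
```

re.escape is used so that keywords containing characters such as "." or "*" are matched literally.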
