Duplicate characters in a string - python

Hello, I am a beginner in Python. I am building a small program that finds duplicate characters in a string. However, there's something I don't understand.
Code:
def is_isogram(string):
    dict = {}
    for letter in string:
        dict[letter] = 1
    if letter in dict:
        dict[letter] += 1
    return dict
print(is_isogram("Dermatoglyphics"))
OUTPUT
{'D': 1, 'e': 1, 'r': 1, 'm': 1, 'a': 1, 't': 1, 'o': 1, 'g': 1, 'l': 1, 'y': 1, 'p': 1, 'h': 1, 'i': 1, 'c': 1, 's': 2}
I set an empty dictionary.
I then used a for loop to iterate over the string; in each iteration it should assign 1 to the dictionary key for the current letter.
Then I used "if...in" to check whether the letter has already appeared, and if it has, the value for that letter should be incremented by 1.
I tried it on a word, Dermatoglyphics, but the last key-value pair is always 2, even though this word contains only one of each letter. Does anyone know why?

The if statement runs only once, after the for loop has finished, so it adds 1 only to the last character. It's a problem of indentation. Even if you moved the if inside the loop, the logic would still be wrong: you assign dict[letter] = 1 for every letter and then check if letter in dict, which is now always true, so every letter would end up with 2. Test first and use an else branch instead.
def is_isogram(string):
    dict = {}
    for letter in string:
        if letter in dict:
            dict[letter] += 1
        else:
            dict[letter] = 1
    return dict
print(is_isogram("Dermatoglyphics"))
Or you can use count function like this
def is_isogram(string):
    dict = {}
    for letter in string:
        dict[letter] = string.count(letter)
    return dict
print(is_isogram("Dermatoglyphics"))

You are setting 1 for each, and then you increment the last letter. I think you meant to put if inside for block.
Here is a working version:
def is_isogram(string):
    dct = {}
    for letter in string:
        if letter in dct:
            dct[letter] += 1
        else:
            dct[letter] = 1
    return dct
print(is_isogram("Dermatoglyphics"))
The logic behind it: if the letter already exists, increment its counter. Otherwise, initialize it with counter=1.
Edit: Changed dict to dct, as dict is a Python built-in name, as @Michael suggested.
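The same if/else logic can also be collapsed with dict.get, which returns a default value when the key is missing. A minimal sketch; the function name count_letters is made up here for illustration and does not come from the question:

```python
def count_letters(string):
    """Return a dict mapping each character to its number of occurrences."""
    counts = {}
    for letter in string:
        # get() returns 0 when the letter has not been seen yet
        counts[letter] = counts.get(letter, 0) + 1
    return counts

print(count_letters("telegram"))
```

This keeps a single pass over the string and avoids the explicit membership test.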

As your function is named is_isogram() it should return a boolean.
Either the string is an isogram or it isn't.
A big benefit is that you stop iterating as soon as you find a duplicate.
You don't need to use a dict.
It's not a bad idea, but to detect an isogram you don't need to count occurrences of each letter.
You just have to test membership.
A set is better suited: it is like a dict but without the values.
def is_isogram(word: str) -> bool:
    used_letters = set()
    for letter in word:
        if letter in used_letters:
            return False
        else:
            used_letters.add(letter)
    return True
is_isogram("Dermatoglyphics") # True
is_isogram("DDermatoglyphics") # False
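A shorter variant of the set idea compares the number of distinct characters with the length of the word. This is only a sketch: it lowercases the word first, which is an assumption about what should count as a duplicate, and unlike the loop above it cannot stop early at the first repeat.

```python
def is_isogram(word: str) -> bool:
    """True when no character repeats, ignoring case (an assumption)."""
    lowered = word.lower()
    # A set drops duplicates, so the sizes differ only if something repeats
    return len(set(lowered)) == len(lowered)

print(is_isogram("Dermatoglyphics"))   # True
print(is_isogram("DDermatoglyphics"))  # False
print(is_isogram("Alpha"))             # False
```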

Your code works exactly as written: it assigns 1 to every letter, and then, since your if condition is outside the loop, it increments the last character (letter) by one.
I made some changes to your code.
def is_isogram(string):
    dict = {}
    for letter in string:
        dict[letter] = 0
    for letter in string:
        dict[letter] += 1
    return dict
print(is_isogram("telegram"))
What I have done is first add all the letters to the dictionary, then use a second scan to count each letter.
This function has a complexity of O(n), which I think is faster than the other answers.
Here is a timed execution of both
This answer: https://onlinegdb.com/lMC-Qn76D
Other answers: https://onlinegdb.com/eeV0IFN5J
Please correct me if I am wrong
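The complexity claim can also be checked locally rather than through the linked snippets. A rough sketch using timeit; the three function names are invented here, and timings vary by machine, so no numbers are shown. The if/else and two-pass versions both do a constant amount of work per character, while the string.count version rescans the whole string for every character.

```python
import timeit

def count_if_else(s):
    d = {}
    for ch in s:
        if ch in d:
            d[ch] += 1
        else:
            d[ch] = 1
    return d

def count_two_pass(s):
    d = {}
    for ch in s:
        d[ch] = 0
    for ch in s:
        d[ch] += 1
    return d

def count_with_str_count(s):
    d = {}
    for ch in s:
        d[ch] = s.count(ch)  # rescans s on every iteration
    return d

text = "telegram" * 200
for fn in (count_if_else, count_two_pass, count_with_str_count):
    seconds = timeit.timeit(lambda: fn(text), number=20)
    print(f"{fn.__name__}: {seconds:.4f}s")
```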

Related

Counting number of occurrences in a string

I need to return a dictionary that counts the number of times each letter in a predetermined list occurs. The problem is that I need to count both Upper and Lower case letters as the same, so I can't use .lower or .upper.
So, for example, "This is a Python string" should return {'t':3} if "t" is the letter being searched for.
Here is what I have so far...
def countLetters(fullText, letters):
    countDict = {i:0 for i in letters}
    lowerString = fullText.lower()
    for i in lowerString:
        if i in letters:
            countDict[i] += 1
    return countDict
Where 'letters' is the condition and fullText is the string I am searching.
The obvious issue here is that if the test is "T" rather than "t", my code won't return anything. Sorry for any errors in my terminology, I am pretty new to this. Any help would be much appreciated!
To ignore capitalization, convert the input to lowercase first:
input = input.lower()
The code below counts every character of the input text using list operations.
It can also be used as a word counter if you split on the space character.
input = "Batuq batuq BatuQ"  # Reads all inputs up to the EOF character
input = input.replace('-', ' ')  # Replace (-, + etc.) expressions with a space character.
input = input.replace('.', '')
input = input.replace(',', '')
input = input.replace("`", '')
input = input.replace("'", '')
#input = input.split(' ')  # if you use it, it will sort by the most repetitive words
dictionary = dict()
count = 0
for word in input:
    dictionary[word] = input.count(word)
print(dictionary)
# Writes the 5 most repetitive characters
for k in sorted(dictionary, key=dictionary.get, reverse=True)[:5]:
    print(k, dictionary[k])
Would something like this work that handles both case sensitive letter counts and non case sensitive counts?
from typing import List
def count_letters(
    input_str: str,
    letters: List[str],
    count_case_sensitive: bool=True
):
    """
    count_letters consumes a list of letters and an input string
    and returns a dictionary of counts by letter.
    """
    if count_case_sensitive is False:
        input_str = input_str.lower()
        letters = list(set(map(lambda x: x.lower(), letters)))
    # dict comprehension - build your dict in one line
    # Tutorial on dict comprehensions: https://www.datacamp.com/community/tutorials/python-dictionary-comprehension
    counts = {letter: input_str.count(letter) for letter in letters}
    return counts
# define function inputs
letters = ['t', 'a', 'z', 'T']
string = 'this is an example with sTrings and zebras and Zoos'
# case sensitive
count_letters(
    string,
    letters,
    count_case_sensitive=True
)
# {'t': 2, 'a': 5, 'z': 1, 'T': 1}
# not case sensitive
count_letters(
    string,
    letters,
    count_case_sensitive=False
)
# {'a': 5, 'z': 2, 't': 3} # notice input T is now just t in dictionary of counts
Try it like this:
def count_letters(fullText, letters):
    countDict = {i: 0 for i in letters}
    lowerString = fullText.lower()
    for i in lowerString:
        if i in letters:
            countDict[i] += 1
    return countDict
test = "This is a Python string."
print(count_letters(test, 't'))  # Output: {'t': 3}
You're looping over the wrong string. You need to loop over lowerString, not fullText, so you ignore the case when counting.
It's also more efficient to do if i in countDict than if i in letters.
def countLetters(fullText, letters):
    countDict = {i.lower():0 for i in letters}
    lowerString = fullText.lower()
    for i in lowerString:
        if i in countDict:
            countDict[i] += 1
    return countDict
What you can do is simply duplicate the dict with both upper and lowercase like so:
def countLetters(fullText, letters):
    countDict = {}
    for i in letters:
        countDict[i.upper()]=0
        countDict[i.lower()]=0
    lowerString = fullText.lower()
    letters = letters.lower()
    for i in lowerString:
        if i in letters:
            countDict[i] += 1
            if (i!=i.upper()):
                countDict[i.upper()] +=1
    return countDict
print(countLetters("This is a Python string", "TxZY"))
Now some things you can also do is loop over the original string and change countDict[i] += 1 to countDict[i.lower()] +=1
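That last suggestion, looping over the original string and lowercasing only the lookup key, might look like the sketch below. The snake_case names are illustrative, and keying the result by lowercase letter is an assumption based on the {'t': 3} example in the question.

```python
def count_letters(full_text, letters):
    # Key the result by lowercase letter so 'T' and 't' share one counter
    counts = {letter.lower(): 0 for letter in letters}
    for ch in full_text:
        key = ch.lower()
        if key in counts:
            counts[key] += 1
    return counts

print(count_letters("This is a Python string", "T"))  # {'t': 3}
```

Because both the requested letters and the scanned characters are lowercased, it no longer matters whether the caller passes "T" or "t".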
Use the Counter from the collections module
from collections import Counter
input = "Batuq batuq BatuQ"
bow=input.split(' ')
results=Counter(bow)
print(results)
output:
Counter({'Batuq': 1, 'batuq': 1, 'BatuQ': 1})
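As the output shows, Counter is case sensitive by default. To treat upper and lower case as the same, the text can be lowercased before counting. A small sketch; the variable names are chosen here for illustration:

```python
from collections import Counter

text = "Batuq batuq BatuQ"
# Lowercase first so 'Batuq', 'batuq' and 'BatuQ' collapse into one key
word_counts = Counter(text.lower().split())
letter_counts = Counter(ch for ch in text.lower() if ch.isalpha())

print(word_counts)         # Counter({'batuq': 3})
print(letter_counts["t"])  # 3
```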

count number of letters that occur in a string and add the letter and number of times it occurs to a dictionary as a key value pair

I have written a function to count the number of times a letter occurs in a string and return the letter and the number of times it appears in the string as key: value pairs inside a dictionary, the function is to ignore any punctuation.
here is what I have
def count_letters(text):
    result = {}
    for letter in text:
        if letter.isalpha():
            if result[letter.lower()]:
                result[letter.lower()] += 1
            result[letter.lower()] = 1
    return(result)
I thought if the key result[letter] doesn't exist it should skip the if statement
if result[letter.lower()]:
but instead it throws a key error, what am I doing wrong?
I am expecting if
text = "This is a sentence."
then my function should return
{'t': 2, 'h': 1, 'i': 2, 's': 3, 'a': 1, 'e': 3, 'n': 2, 'c': 1}
I am trying to do this without importing any modules; it's a learning exercise to improve my knowledge of dictionaries.
Try it:
def count_letters(text):
    result = {}
    for letter in text:
        if letter.isalpha():
            if letter.lower() in result.keys():
                result[letter.lower()] += 1
            else:
                result[letter.lower()] = 1
    return(result)
It's throwing a KeyError because that key doesn't exist yet. Use a try-except instead:
lower = letter.lower() # Only need to do this once
try:
    result[lower] += 1
except KeyError:
    result[lower] = 1
Also FWIW, the canonical way to do this is with collections.Counter, e.g:
collections.Counter(c.lower() for c in text if c.isalpha())
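Written out as a complete function, the Counter one-liner could look like this. It does import a module, so it steps outside the constraint stated in the question and is shown only for comparison; converting the result to a plain dict is a choice made here.

```python
from collections import Counter

def count_letters(text):
    """Count letters case-insensitively, skipping punctuation and spaces."""
    return dict(Counter(c.lower() for c in text if c.isalpha()))

print(count_letters("This is a sentence."))
# {'t': 2, 'h': 1, 'i': 2, 's': 3, 'a': 1, 'e': 3, 'n': 2, 'c': 1}
```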

how to make vowel counter and sum more concise and efficient? Python

I am new to Stack Overflow, but noticed how helpful and open this community is. Just curious if there is any way to make this vowel counter more concise/organized. Any help would be appreciated, and an in-depth answer would be awesome as well. Thank you!
def vowel_count(str):
    str = input("Please enter a sentence: ")
    str1 = (str.lower())
    #initialize count variable to zero
    count = 0
    #create a set of vowels
    vowel = set("aeiou")
    for alphabet in str1:
        if alphabet in vowel:
            count = count+1
    print("Number of vowels in this sentence is: " , count)
    print()
    print("A,E,I,O,U")
    print(*map(str.lower().count, "aeiou"))
vowel_count(str)
I see that in your code example, you used a variable named str. Don't do that: str is the name of a built-in type, and shadowing it can lead to problems.
What about this solution:
string = input().lower()
print(sum([string.count(i) for i in "aeiou"]))
Firstly, I get the input, which I lower immediately.
Then, I used the string.count(i) for every vowel, which in this case returns the amount of times i (one of the vowels) appears in the string (input).
I then called the sum function on the resulting list, which returns the sum of all its elements. Last but not least, I simply printed the value returned by sum.
If you don't understand how the argument passed to the sum function is a list, look into the topic of list comprehensions.
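A variant of the same idea scans the sentence once instead of calling count five times. This is a sketch only: it takes the sentence as a parameter rather than reading it with input(), and that parameter name is chosen here.

```python
def vowel_count(sentence):
    """Return the number of vowels (a, e, i, o, u) in sentence, ignoring case."""
    vowels = set("aeiou")
    # True counts as 1 and False as 0 when summed
    return sum(ch in vowels for ch in sentence.lower())

print(vowel_count("Please enter a sentence"))  # 9
```

Passing a generator expression to sum avoids building an intermediate list.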
I think you should use Counter.
from collections import Counter
In [1]: a = 'I am mayank'
In [5]: ans = Counter(a.lower())
In [6]: ans
Out[6]: Counter({' ': 2, 'a': 3, 'i': 1, 'k': 1, 'm': 2, 'n': 1, 'y': 1})
In [10]: ans['a']
Out[10]: 3
This will count the occurrence of each letter in the string including vowels. This should be quite efficient.
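To get back to just the vowels, the Counter can be indexed per vowel; a missing key in a Counter yields 0 rather than raising KeyError. A short sketch reusing the sample string above, with variable names chosen here:

```python
from collections import Counter

sentence = "I am mayank"
counts = Counter(sentence.lower())
# Counter returns 0 for vowels that never occur
vowel_counts = {v: counts[v] for v in "aeiou"}

print(vowel_counts)                # {'a': 3, 'e': 0, 'i': 1, 'o': 0, 'u': 0}
print(sum(vowel_counts.values()))  # 4
```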

How can I tell what value my function is returning in Python?

I'm trying to debug this program I wrote. How can I tell if, for a given word, hand, and word_list, it returns True or False? I tried initializing a variable failure and then modifying it and printing its value. It isn't printing, so I don't know if it is behaving like it's supposed to. Any help is appreciated.
I have a function load_words() that returns a list of words. I know word is in word_list (I checked), so I'm just trying to see if word is composed entirely of letters from the keys in the dictionary hand, which in this case it isn't, so it should return False.
Also, what is the difference between .keys() and .iterkeys(), and is there a better way of looping through hand, perhaps with letter, value in hand.iteritems()?
word = 'axel'
hand2 = {'b':1, 'x':2, 'l':3, 'e':1}
def is_valid_word(word, hand, word_list):
    """
    Returns True if word is in the word_list and is entirely
    composed of letters in the hand. Otherwise, returns False.
    Does not mutate hand or word_list.
    word: string
    hand: dictionary (string -> int)
    word_list: list of lowercase strings
    """
    failure = False
    if word in word_list:
        print hand
        print [list(i) for i in word.split('\n')][0]
        for letter in [list(i) for i in word.split('\n')][0]:
            print letter
            if letter in hand.keys():
                print letter
                return True
                failure = True
                print failure
            else:
                return False
                failure = False
                print failure
    else:
        return False
        failure = False
        print failure
is_valid_word(word,hand2,load_words())
UPDATE I wish to use this function in my function, but it gives a key error, even though it works fine on its own.
def update_hand(hand, word):
    """
    Assumes that 'hand' has all the letters in word.
    In other words, this assumes that however many times
    a letter appears in 'word', 'hand' has at least as
    many of that letter in it.
    Updates the hand: uses up the letters in the given word
    and returns the new hand, without those letters in it.
    Has no side effects: does not modify hand.
    word: string
    hand: dictionary (string -> int)
    returns: dictionary (string -> int)
    """
    for letter in [list(i) for i in word.split('\n')][0]:
        if letter in hand.keys():
            hand[letter] = hand[letter]-1
        if hand[letter] <= 0:
            del hand[letter]
    display_hand(hand)
    return hand
The reason it is not printing is that you return from the function before the print runs. A return ends the function immediately, so the print statement is never reached. For example:
def foo(x):
    return x
    print x
foo("asdf")
will print nothing, while:
def foo(x):
    print x
    return x
foo("asdf")
will print:
asdf
So put all your statements before the return; anything after it will not execute.
For your second clarification, this post already has your answer https://stackoverflow.com/a/3617008:
In Python 2, iter(d.keys()) and d.iterkeys() are not quite equivalent, although they will behave the same. In the first, keys() will return a copy of the dictionary's list of keys and iter will then return an iterator object over this list, with the second a copy of the full list of keys is never built.
Note that Python 3 does not have .iterkeys() either. Its .keys() no longer builds a list copy; it returns a lightweight view of the keys, so it plays the role that .iterkeys() had in Python 2.
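The Python 3 behaviour can be seen in a few lines. This sketch is written for Python 3, unlike the Python 2 code in the question, and the sample dictionary is made up:

```python
hand = {'b': 1, 'x': 2}
keys = hand.keys()   # a view object, not a list copy
hand['l'] = 3        # added after the view was created

print(list(keys))    # ['b', 'x', 'l']
print('l' in keys)   # True
print('l' in hand)   # True, membership works on the dict directly
```

The view reflects later changes to the dictionary, which a list copy would not.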
Lastly, I will review what is generally wrong with your code and what you want to achieve in descending order of severity.
Your code only checks one letter
[list(i) for i in word.split('\n')][0] is not how you get all the letters from a word.
You should make short code return first so that you would not have big indent blocks.
Your code only checks one letter
In your for loop, you return True immediately after the first letter is checked. You should return True after the loop is completed instead.
for letter in word:
    if letter not in hand.keys():
        return False
return True
List comprehension
Your list comprehension is not needed (I'll tell you why later) and need not be so complex just to get the letters from a word. E.g.
[list(i) for i in word.split('\n')][0]
Actually only does this:
list(word)
In fact, you should just iterate through the word directly (as I did above), it will return the letters one by one:
for letter in word:
    # code...
Make short code return first
Usually I dislike big chunks of highly indented code. What you can do is make the short code return first. For example:
if word in word_list:
    for letter in word:
        if letter in hand.keys():
            return True
        else:
            return False
else:
    return False
can simply be written as:
if word not in word_list:
    return False
for letter in word:
    if letter in hand.keys():
        return True
    else:
        return False
However, this is just my opinion. Some others may prefer the else statement so that they know when the code is executed.
Your final code would look like:
def is_valid_word(word, hand, word_list):
    if word not in word_list:
        return False
    for letter in word:
        if letter not in hand.keys():
            return False
    return True
Clean, right? However, I assume that you are making something like a Scrabble game, so you would want to check whether the letters in your hand can form the word you chose. What you can add is a check that the number of times each letter occurs in the word is less than or equal to the number of that letter in your hand:
def is_valid_word(word, hand, word_list):
    if word not in word_list:
        return False
    # This makes the word into a "unique list"
    letters = set(word)
    for letter in letters:
        if hand[letter] < word.count(letter):
            return False
    return True
EDIT
There was a problem with the code above: the condition if hand[letter] < word.count(letter): raises a KeyError when the letter is not in hand at all. Check membership first:
def is_valid_word(word, hand, word_list):
    if word not in word_list:
        return False
    letters = set(word)
    for letter in letters:
        # Add this extra clause
        if letter not in hand or hand[letter] < word.count(letter):
            return False
    return True
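The per-letter count check can also be expressed with collections.Counter, which is a different technique from the loop above and is shown only as a sketch for Python 3. Subtracting one Counter from another keeps only positive counts, so the result is empty exactly when the hand covers the word. The sample hand and word lists are made up.

```python
from collections import Counter

def is_valid_word(word, hand, word_list):
    if word not in word_list:
        return False
    # Letters the word needs beyond what the hand supplies
    missing = Counter(word) - Counter(hand)
    return not missing

hand = {'b': 1, 'x': 2, 'l': 3, 'e': 1}
print(is_valid_word('bxel', hand, ['bxel', 'axel']))  # True
print(is_valid_word('axel', hand, ['bxel', 'axel']))  # False, no 'a' in hand
print(is_valid_word('beer', hand, ['beer']))          # False, only one 'e'
```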
You can print the result directly: print is_valid_word(word,hand2,load_words())
You have some indentation issues, and doing something after a return statement is futile.
You don't need to use keys or iterkeys; the in operator will check for you, and it works with lists, sets, dicts (keys), tuples, strings, ...
The in operator invokes __contains__, which is supported by most Python collections.
Also have a look at https://docs.python.org/2/reference/expressions.html#membership-test-details.
Here is a minimized example of what you want to do, with 3 tests.
def is_valid_word(word, hand, word_list):
    """
    Returns True if word is in the word_list and is entirely composed
    of letters in the hand. Otherwise, returns False. Does not mutate
    hand or word_list.
    word: string
    hand: dictionary (string -> int)
    word_list: list of lowercase strings
    """
    if word not in word_list:
        return False
    for letter in word:
        if letter not in hand:
            return False
    return True
print(is_valid_word('bxel',
                    {'b': 1, 'x': 2, 'l': 3, 'e': 1},
                    ['foo', 'bar', 'bxel']))
print(is_valid_word('axel',
                    {'b': 1, 'x': 2, 'l': 3, 'e': 1},
                    ['foo', 'bar', 'axel']))
print(is_valid_word('axel',
                    {'a': 1, 'x': 2, 'l': 3, 'e': 1},
                    ['foo', 'bar', 'axel']))
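The update_hand function from the question's UPDATE promises in its docstring not to modify hand, yet it changes the dictionary it is given. One possible way to honour the docstring, sketched for Python 3 and leaving out the display_hand call (which is not defined in the question), is to work on a copy:

```python
def update_hand(hand, word):
    """Return a new hand with the letters of word used up; hand is untouched."""
    new_hand = dict(hand)  # a shallow copy is enough for str -> int
    for letter in word:
        if letter in new_hand:
            new_hand[letter] -= 1
            if new_hand[letter] <= 0:
                del new_hand[letter]
    return new_hand

hand = {'b': 1, 'x': 2, 'l': 3, 'e': 1}
print(update_hand(hand, 'bxel'))  # {'x': 1, 'l': 2}
print(hand)                       # {'b': 1, 'x': 2, 'l': 3, 'e': 1}
```

Keeping the count check inside the membership test means a letter that is absent from the hand is skipped rather than looked up.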

How can I 'translate' elements from a 'list' to those in another 'list' in Python?

Alright, I did a bit of researching before I came here, and I didn't find anything that catered especially to what I wanted to find.
The problem is a little difficult, (at least, for me) so I'll do my best to explain it here.
Ok: So I've got a list of letters, that are taken from a string a user enters.
(They enter a string, then it's all broken down into individual letters and added to a list, where each letter is its own 'element')
Now I wanted to do this because I wanted to have all the letters inside this list to be translated into something else, depending on what I had 'set' each letter to be translated to.
I'm a bit confused as to how to do this, I could have a whole while loop that did the following:
It takes the first element, then runs the whole alphabet and the numbers 1-9 by it. If it matches anything, it adds it to a new list.
The trouble is, it seems really inefficient to do this, and I am thinking there must surely be a better way. I'll post what I am talking about below:
I can't get the code formatting to work correctly, and I'm just frustrated with it. Here's the code:
print("\t\t\tThe fun encryptor")
print("\n\n Please Enter the following string to be encrypted")
STRING = input("Entry: ")
STRINGCOPY = [STRING]
DIRECTORY = []
#The string is to be encrypted.
STRINGLEN = len(STRING)
OPPLIMIT = 0
REPEAT = False
DIRECTORYT = []
while OPPLIMIT < STRINGLEN:
    DIRECTORY = DIRECTORY + str.split(STRING[OPPLIMIT])
    OPPLIMIT += 1
# String Added to the Directory necessary
if "a" in DIRECTORY[0]:
    DIRECTORYT += [0.01]
elif "b" in DIRECTORY[0]:
    DIRECTORYT += [0.11]
elif "c" in DIRECTORY:
    DIRECTORYT += [1.11]
#and so on and so forth
a="0.01"
b="0.11"
c="1.11"
d="0.02"
e="0.22"
f="2.22"
g="0.03"
h="0.33"
i="3.33"
j="0.04"
k="0.44"
l="4.44"
m="0.05"
n="0.55"
o="5.55"
p="0.06"
q="0.66"
r="6.66"
s="0.07"
t="0.77"
u="7.77"
v="0.08"
w="0.88"
x="8.88"
y="0.09"
z="0.99"
As you can see, it seems almost pointless to go through this all, is there an easier way to do it?
Perhaps using the for function thing?
First, instead of creating 26 separate variables named a, b, etc., just create a dict:
values = {'a': 0.01,
          'b': 0.11,
          #...
          }
Now, you can just do this:
for letter in DIRECTORY[0]:
    DIRECTORYT += [values[letter]]
Or, alternatively:
for letter in values:
    if letter in DIRECTORY[0]:
        DIRECTORYT += [values[letter]]
The difference between the two lies in how they handle duplicates. And I'm not sure which one you want (or, if you never have any dups, so it doesn't matter). Try executing both with different sample data until you get the idea.
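A small experiment along those lines, using a shortened three-letter table and a made-up input with repeated letters, shows the difference:

```python
values = {'a': 0.01, 'b': 0.11, 'c': 1.11}  # shortened table for illustration
letters = "abba"                             # sample input with duplicates

# Loop over the input: one output value per occurrence, in input order
per_occurrence = [values[ch] for ch in letters]
# Loop over the table: one output value per distinct letter, in table order
per_distinct = [values[ch] for ch in values if ch in letters]

print(per_occurrence)  # [0.01, 0.11, 0.11, 0.01]
print(per_distinct)    # [0.01, 0.11]
```

Note that the first form raises KeyError for a character that is missing from the table, while the second silently ignores it.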
As a side note, it's generally better to do:
DIRECTORYT.append(values[letter])
than:
DIRECTORYT += [values[letter]]
In other words, don't create a list if you don't need one.
But in this case, you may be able to replace the whole loop with a list comprehension or generator expression:
DIRECTORYT += [values[letter] for letter in DIRECTORY[0]]
DIRECTORYT += [values[letter] for letter in values if letter in DIRECTORY[0]]
or:
DIRECTORYT.extend(values[letter] for letter in DIRECTORY[0])
DIRECTORYT.extend(values[letter] for letter in values if letter in DIRECTORY[0])
The advantage of the extend/generator expression versions is, again, that they don't build temporary lists.
Not really sure what you are trying to do, but here are two approaches.
You could set up a dictionary with the letters and their translations, then do a lookup on this dictionary when your user enters a word.
codes = {}
codes['a'] = 9
codes['b'] = 133
# and so on
codes['z'] = 1
user_input = raw_input('Please enter a string: ')
translated_stuff = []
for letter in user_input:
    if letter in codes:
        translated_stuff.append(codes[letter])
# a shorter way to do the above loop is
# translated_stuff = [codes[i] for i in user_input if i in codes]
# the codes are ints, so convert them to str before joining
print "Your translated stuff is : {}".format(''.join(str(c) for c in translated_stuff))
Or, if you want that simply check if an input matches a set:
import string
match_list = string.letters + string.digits
user_input = raw_input('Please enter a string: ')
matched_stuff = [i for i in user_input if i in match_list]
It's unclear to me everything you were trying to accomplish in your code, but here is how to translate each of the characters of the string entered into the values shown in your code. There's no need to first convert it to a list, because a string is already a sequence of letters, which can be treated just like items in a list in most cases (the main difference being that they can't be changed).
It uses a Python dictionary (or dict) to hold the translation because that allows it to be used to look up each letter and find the value associated with it very quickly, usually much faster than searching for it by checking every entry in a regular table or list.
It also makes sure only alphabetic characters (letters) are translated and converts uppercase ones to lowercase for the look-up operation, since the dictionary only has entries for lowercase letters.
import string
translate = {
'a': "0.01", 'b': "0.11", 'c': "1.11", 'd': "0.02", 'e': "0.22", 'f': "2.22",
'g': "0.03", 'h': "0.33", 'i': "3.33", 'j': "0.04", 'k': "0.44", 'l': "4.44",
'm': "0.05", 'n': "0.55", 'o': "5.55", 'p': "0.06", 'q': "0.66", 'r': "6.66",
's': "0.07", 't': "0.77", 'u': "7.77", 'v': "0.08", 'w': "0.88", 'x': "8.88",
'y': "0.09", 'z': "0.99",
}
print("\t\t\tThe fun encryptor")
print("\n\nPlease type in a string to be encrypted and press the Enter key")
entry = input("Entry: ")
directory = [translate[ch.lower()] for ch in entry if ch in string.ascii_letters]
print('directory:', directory)
