Python function with two strings - sub-anagram - python

I want to define a function that takes two strings as arguments and returns True if the first string is a 'sub-anagram' of the second. That is, the function should return True only if every letter in the first string appears at least as many times in the second string.
e.g. key is a 'sub-anagram' of keyboard, but mouse isn't.
Here's my code so far:
# -*- coding: utf-8 -*-
def anagram(str1, str2):
    # string to list
    str1 = list(str1.lower())
    str2 = list(str2.lower())
    # sort list
    str1.sort()
    str2.sort()
    # join list back to string
    str1 = ''.join(str1)
    str2 = ''.join(str2)
    return str1 == str2

print(anagram('trainers', 'strainer'))
So far it returns True only if both strings are exact anagrams, and I am not sure how to change it.
Thank you.

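As @achampion mentioned, Counter is the best way to go about this. To check whether string a has all the characters needed to make string b (so b is the 'sub-anagram' candidate and a is the longer string):
from collections import Counter

def contains_anagram(a, b):
    a = Counter(a)
    b = Counter(b)
    # A Counter returns 0 for missing keys, so letters absent from a fail the test
    return all(b[letter] <= a[letter] for letter in b)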
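For the argument order used in the question (the candidate first, the longer word second), a minimal sketch with Counter subtraction might look like the following. The name is_sub_anagram is my own, and the lowercasing simply mirrors the asker's code, so treat both as assumptions.

```python
from collections import Counter

def is_sub_anagram(first, second):
    # Counter subtraction keeps only positive counts, so the result is
    # empty exactly when `second` has every letter of `first` often enough.
    missing = Counter(first.lower()) - Counter(second.lower())
    return not missing

print(is_sub_anagram('key', 'keyboard'))    # True
print(is_sub_anagram('mouse', 'keyboard'))  # False
```

An empty Counter is falsy, which is why `not missing` is enough here.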

Related

How would I detect duplicate elements of a string from another string in python?

So how would I go about finding a duplicate element of a string from another string in Python, ideally with a one- or two-line quick fix?
For example:
str1 = "abccde"
str2 = "abcde"
# gets me c
That is, using str2, I want to detect that str1 contains a duplicate of one of str2's elements. I'm not sure if there's a way to do that through .count, like str1.count(str2) or something.
I'm using this for my hangman assignment. I'm a beginner coder, so we are mostly using built-in functions and the basics for the assignments. There's a piece of code within my loop that keeps printing because it trips on the double letters.
Ex. hello, grinding, concoction.
So I made a "used" string, and I am trying to compare that to my correct letters list; the guesses are appended to it so I can avoid that.
Note: the words will be entered by the user, so I won't be able to just hardcode the letter c, if that makes sense.
Thank you!
Using set with str.count:
def find_dup(str1, str2):
    return [i for i in set(str1) if str1.count(i) > 1 and i in set(str2)]
Output:
find_dup("abccde", "abcde")
# ['c']
find_dup("abcdeffghi" , "aaaaaabbbbbbcccccddeeeeefffffggghhiii") # from comment
# ['f']
My guess is that maybe you're trying to write a method similar to:
def duplicate_string(str1: str, str2: str) -> str:
    str2_set = set(str2)
    if len(str2_set) != len(str2):
        raise ValueError(f'{str2} has duplicate!')
    output = ''
    for char in str1:
        if char in str2_set:
            str2_set.remove(char)
        else:
            output += char
    return output

str1 = "abccccde"
str2 = "abcde"
print(duplicate_string(str1, str2))
Output
ccc
Here, we first raise an error if str2 itself has a duplicate. Then we loop through str1 and either remove the char from str2_set (on its first occurrence) or append the duplicate to the output string.
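The same "what is left over" idea can also be sketched with collections.Counter. This is only an illustration under my own assumptions: the name extra_letters is hypothetical, and unlike the function above it does not raise when str2 contains duplicates.

```python
from collections import Counter

def extra_letters(str1, str2):
    # Counter subtraction drops zero and negative counts, leaving only the
    # letters that str1 has more copies of than str2.
    surplus = Counter(str1) - Counter(str2)
    return ''.join(surplus.elements())

print(extra_letters("abccde", "abcde"))    # c
print(extra_letters("abccccde", "abcde"))  # ccc
```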
You are basically looking for a diff function between the two strings. Adapting this beautiful answer:
import difflib

cases = [('abcccde', 'abcde')]

for a, b in cases:
    print('{} => {}'.format(a, b))
    for i, s in enumerate(difflib.ndiff(a, b)):
        if s[0] == ' ':
            continue
        elif s[0] == '-':
            print(u'The second string is missing the "{}" in position {} of the first string'.format(s[-1], i))
        elif s[0] == '+':
            print(u'The first string is missing the "{}" in position {} of the second string'.format(s[-1], i))
    print()
Output
abcccde => abcde
The second string is missing the "c" in position 3 of the first string
The second string is missing the "c" in position 4 of the first string

How to delete repeating letters in a string?

I am trying to write a function which will return me the string of unique characters present in the passed string. Here's my code:
def repeating_letters(given_string):
    counts = {}
    for char in given_string:
        if char in counts:
            return char
        else:
            counts[char] = 1
    if counts[char] > 1:
        del(char)
    else:
        return char
I am not getting the expected results with it. How can I get the desired result?
When I pass this string as input:
sample_input = "abcadb"
I expect the result to be:
"abcd"
However, my code returns just:
nothing
def repeating_letters(given_string):
    seen = set()
    ret = []
    for c in given_string:
        if c not in seen:
            ret.append(c)
            seen.add(c)
    return ''.join(ret)
Here we add each letter to the set seen the first time we see it, at the same time adding it to a list ret. Then we return the joined list.
Here's a one-liner to achieve this, using set with sorted, if the order in the resulting string matters:
>>> my_str = 'abcadbgeg'
>>> ''.join(sorted(set(my_str),key=my_str.index))
'abcdge'
Here sorted will sort the characters in the set based on the first index of each in the original string, resulting in an ordered list of characters.
However, if the order in the resulting string doesn't matter, you may simply do the following (the output order is arbitrary and can differ between runs):
>>> ''.join(set(my_str))
'acbedg'
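Another order-preserving option, sketched here as an alternative rather than as anything from the answers above, relies on dict keys being unique and keeping insertion order (guaranteed from Python 3.7). The name unique_letters is my own.

```python
def unique_letters(given_string):
    # dict.fromkeys keeps the first occurrence of each character, in order
    return ''.join(dict.fromkeys(given_string))

print(unique_letters("abcadb"))     # abcd
print(unique_letters("abcadbgeg"))  # abcdge
```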

How to check if string is a pangram?

I want to create a function which takes a string as input and checks whether the string is a pangram or not (a pangram is a piece of text which contains every letter of the alphabet).
I wrote the following code, which works, but I am looking for an alternative way to do it, hopefully a shorter way.
import string

def is_pangram(gram):
    gram = gram.lower()
    gram_list_old = sorted([c for c in gram if c != ' '])
    gram_list = []
    for c in gram_list_old:
        if c not in gram_list:
            gram_list.append(c)
    if gram_list == list(string.ascii_lowercase):
        return True
    else:
        return False
I feel like this question might be against the rules of this website but hopefully it isn't. I am just curious and would like to see alternative ways to do this.
is_pangram = lambda s: not set('abcdefghijklmnopqrstuvwxyz') - set(s.lower())
>>> is_pangram('abc')
False
>>> is_pangram('the quick brown fox jumps over the lazy dog')
True
>>> is_pangram('Does the quick brown fox jump over the lazy dog?')
True
>>> is_pangram('Do big jackdaws love my sphinx of quartz?')
True
Test string s is a pangram if we start with the alphabet, remove every letter found in the test string, and all the alphabet letters get removed.
Explanation
The use of 'lambda' is a way of creating a function, so it's a one line equivalent to writing a def like:
def is_pangram(s):
return not set('abcdefghijklmnopqrstuvwxyz') - set(s.lower())
set() creates a data structure which can't have any duplicates in it, and here:
The first set is the (English) alphabet letters, in lowercase
The second set is the characters from the test string, also in lowercase. And all the duplicates are gone as well.
Subtracting things like set(..) - set(..) returns the contents of the first set, minus the contents of the second set. set('abcde') - set('ace') == set('bd').
In this pangram test:
we take the characters in the test string away from the alphabet
If there's nothing left, then the test string contained all the letters of the alphabet and must be a pangram.
If there's something leftover, then the test string did not contain all the alphabet letters, so it must not be a pangram.
any spaces, punctuation characters from the test string set were never in the alphabet set, so they don't matter.
set(..) - set(..) will return an empty set, or a set with content. If we force sets into the simplest True/False values in Python, then containers with content are 'True' and empty containers are 'False'.
So we're using not to check "is there anything leftover?" by forcing the result into a True/False value, depending on whether there's any leftovers or not.
not also changes True -> False, and False -> True. Which is useful here, because (alphabet used up) -> an empty set which is False, but we want is_pangram to return True in that case. And vice-versa, (alphabet has some leftovers) -> a set of letters which is True, but we want is_pangram to return False for that.
Then return that True/False result.
is_pangram = lambda s: not set('abcdefghijklmnopqrstuvwxyz') - set(s.lower())
# Test string `s`
#is a pangram if
# the alphabet letters
# minus
# the test string letters
# has NO leftovers
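To make the two steps of the explanation visible separately, here is a small sketch that splits them into two functions. The helper name missing_letters is hypothetical and only there for illustration.

```python
import string

def missing_letters(s):
    # Alphabet minus the letters seen in s; an empty set means pangram
    return set(string.ascii_lowercase) - set(s.lower())

def is_pangram(s):
    # An empty set is falsy, so `not` turns "nothing left over" into True
    return not missing_letters(s)

print(sorted(missing_letters('the quick brown fox jumps over the dog')))  # ['a', 'l', 'y', 'z']
print(is_pangram('Do big jackdaws love my sphinx of quartz?'))            # True
```

Returning the leftover set also tells you which letters are missing, which the one-liner hides.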
You can use something as simple as:
import string
is_pangram = lambda s: all(c in s.lower() for c in string.ascii_lowercase)
Sets are excellent for membership testing:
>>> import string
>>> candidate = 'ammdjri * itouwpo ql ? k # finvmcxzkasjdhgfytuiopqowit'
>>> ascii_lower = set(string.ascii_lowercase)
Strip the whitespace and punctuation from the candidate then test:
>>> candidate_lower = ascii_lower.intersection(candidate.lower())
>>> ascii_lower == candidate_lower
False
Find out what is missing:
>>> ascii_lower.symmetric_difference(candidate_lower)
set(['b', 'e'])
Try it again but add the missing letters:
>>> candidate = candidate + 'be'
>>> candidate_lower = ascii_lower.intersection(candidate.lower())
>>> ascii_lower == candidate_lower
True
>>>
def pangram(word):
    word = word.lower()
    # range(26) covers 'a' (97) through 'z' (122); range(25) would skip 'z'
    return all(chr(c + 97) in word for c in range(26))
How about simply checking whether every lowercase letter of the alphabet is in the sentence:
text = input()
s = set(text.lower())
if sum(1 for c in s if 96 < ord(c) < 123) == 26:
    print('pangram')
else:
    print('not pangram')
or in a function:
def ispangram(text):
    return sum(1 for c in set(text.lower()) if 96 < ord(c) < 123) == 26
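Here is another definition:
def is_pangram(s):
    # Caveat: digits, punctuation and non-ASCII letters are counted too,
    # so this is only reliable for input made of ASCII letters and spaces.
    return len(set(s.lower().replace(" ", ""))) == 26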
I came up with what I think is the easiest program, without using any module.
def checking(str_word):
    b = []
    # Lowercase first and keep only a-z, so case and punctuation don't skew the count
    for i in str_word.lower():
        if 'a' <= i <= 'z' and i not in b:
            b.append(i)
    b.sort()
    if len(b) == 26:
        print(b)
        print(len(b))
        print(" String is pangram .")
    else:
        print(" String isn't pangram .")

str_word = input(" Enter the String :")
checking(str_word)
I see this thread is a little old, but I thought I'd throw in my solution anyway.
import string

def panagram(phrase):
    new_phrase = sorted(phrase.lower())
    phrase_letters = ""
    for index in new_phrase:
        for letter in string.ascii_lowercase:
            if index == letter and index not in phrase_letters:
                phrase_letters += letter
    print(len(phrase_letters) == 26)
or for the last line:
    print(phrase_letters == string.ascii_lowercase)
def panagram(phrase):
    alphabet = "abcdefghijklmnopqrstuvwxyz"
    phrase = phrase.lower()
    # Fail as soon as one letter of the alphabet is missing
    for char in alphabet:
        if char not in phrase:
            return False
    return True
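import string

def ispangram(str, alphabet=string.ascii_lowercase):
    alphabet = set(alphabet)
    return alphabet <= set(str.lower())
or, more simply:
def ispangram(str):
    # Caveat: digits, punctuation and non-ASCII letters are counted too,
    # so this is only reliable for input made of ASCII letters and spaces.
    return len(set(str.lower().replace(" ", ""))) == 26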
import string

def is_pangram(phrase, alpha=string.ascii_lowercase):
    phrase = phrase.lower()  # compare case-insensitively
    num = len(alpha)
    count = 0
    for i in alpha:
        if i in phrase:
            count += 1
    return count == num
def panagram(str1):
    str1 = str1.replace(' ', '').lower()
    s = set(str1)
    l = list(s)
    if len(l) == 26:
        return True
    return False

str1 = 'The quick brown fox jumps over the lazy dog'
q = panagram(str1)
print(q)
# True
import string

def ispangram(str1, alphabet=string.ascii_lowercase):
    str1 = str1.lower()
    for myalphabet in alphabet:
        if myalphabet not in str1:
            print("it's not pangram")
            break
    else:
        # for/else: this branch runs only if the loop never hit break
        print("it's pangram")
Execute the command:
ispangram("The quick brown fox jumps over the lazy dog")
Output: it's pangram
Hint: string.ascii_lowercase is the string
abcdefghijklmnopqrstuvwxyz
import java.io.*;
import java.util.*;
import java.text.*;
import java.math.*;
import java.util.regex.*;

public class Solution {
    public static void main(String[] args) {
        String s;
        char f;
        Scanner in = new Scanner(System.in);
        s = in.nextLine();
        char[] charArray = s.toLowerCase().toCharArray();
        final Set<Character> set = new HashSet<>();
        for (char a : charArray) {
            if ((int) a >= 97 && (int) a <= 122) {
                f = a;
                set.add(f);
            }
        }
        if (set.size() == 26) {
            System.out.println("pangram");
        }
        else {
            System.out.println("not pangram");
        }
    }
}
import re
import string

def is_panagram(temp):
    # True when every lowercase ASCII letter occurs in temp
    return all(letter in temp for letter in string.ascii_lowercase)

sample = input("enter the string\n")
# Keep letters only and fold to lowercase before testing
string2 = re.sub('[^A-Za-z]+', '', sample).lower()
if is_panagram(string2):
    print("sentence is a panagram")
else:
    print("sentence is not a panagram")

Persistent index in python string

I'm trying to get string.index() to ignore instances of a character that it has already located within a string. Here is my best attempt:
alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"

def save_alphabet(phrase):
    saved_alphabet = ""
    for item in phrase:
        if item in alphabet:
            saved_alphabet = saved_alphabet + str(phrase.index(item))
    return saved_alphabet

print(save_alphabet("aAaEaaUA"))
The output I'd like is "1367", but as index() only finds the first instance of item, it outputs "1361".
What's the best way to do this? The returned value should be in string format.
>>> from string import ascii_uppercase as alphabet
>>> "".join([str(i) for i, c in enumerate("aAaEaaUA") if c in alphabet])
'1367'
A regex solution (though I would not prefer regex in this case):
>>> import re
>>> "".join([str(m.start()) for m in re.finditer(r'[A-Z]', "aAaEaaUA")])
'1367'
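Wrapped back into the asker's function, the enumerate approach might look like this minimal sketch. It keeps the original name save_alphabet and returns a string, as the question requires.

```python
from string import ascii_uppercase

def save_alphabet(phrase):
    # enumerate yields each position once, so repeated letters keep
    # their own index instead of the first match that str.index returns
    return "".join(str(i) for i, c in enumerate(phrase) if c in ascii_uppercase)

print(save_alphabet("aAaEaaUA"))  # 1367
```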

Python: find most frequent bytes?

I'm looking for a (preferably simple) way to find and order the most common bytes in a python stream element.
e.g.
>>> freq_bytes(b'hello world')
b'lohe wrd'
or even
>>> freq_bytes(b'hello world')
[108,111,104,101,32,119,114,100]
I currently have a function that returns a list in the form list[97] == occurrences of "a". I need that to be sorted.
I figure I basically need to flip the list so that list[a] = b becomes list[b] = a, while removing the repeats.
Try the Counter class in the collections module.
from collections import Counter

string = "hello world"
print(''.join(char[0] for char in Counter(string).most_common()))
Note you need Python 2.7 or later.
Edit: I forgot that the most_common() method returns a list of (value, count) tuples, so I used a generator expression to get just the values.
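For an actual bytes object in Python 3, the same idea could be sketched as follows. The name freq_bytes comes from the question. The ordering of bytes with equal counts relies on Counter keeping first-seen order, which is documented from Python 3.7, so treat that part as an assumption about your interpreter version.

```python
from collections import Counter

def freq_bytes(data):
    # Iterating over bytes yields ints in Python 3, so most_common()
    # returns (byte_value, count) pairs sorted by descending count.
    return bytes(value for value, _ in Counter(data).most_common())

print(freq_bytes(b'hello world'))        # b'lohe wrd'
print(list(freq_bytes(b'hello world')))  # [108, 111, 104, 101, 32, 119, 114, 100]
```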
def frequent_bytes(aStr):
    d = {}
    for char in aStr:
        d[char] = d.setdefault(char, 0) + 1
    myList = []
    for char, frequency in d.items():
        myList.append((frequency, char))
    myList.sort(reverse=True)
    # myList holds (frequency, char) tuples, so join only the characters
    return ''.join(char for frequency, char in myList)
>>> frequent_bytes('hello world')
'lowrhed '
I just tried something obvious. @kindall's answer rocks, though. :)
