Finding anagrams in Python - python

I solved this problem the following way in Python:
s1,s2 = raw_input().split()
set1 = set(s1)
set2 = set(s2)
diff = len(set1.intersection(s2))
if(diff == 0)
print "Anagram!"
else:
print "Not Anagram!"
It seemed fine to me. But my professor's program said I'm missing some edge cases. Can you think of any edge cases I might have missed?

The correct way to solve this would be to count the number of characters in both the strings and comparing each of them to see if all the characters are the same and their counts are the same.
Python has a collections.Counter to do this job for you. So, you can simply do
from collections import Counter
if Counter(s1) == Counter(s2):
print "Anagram!"
else:
print "Not Anagram!"
If you don't want to use Counter, you can roll your own version of it, with normal dictionaries and then compare them.
def get_frequency(input_string):
result = {}
for char in input_string:
result[char] = result.get(char, 0) + 1
return result
if get_frequency(s1) == get_frequency(s2):
print "Anagram!"
else:
print "Not Anagram!"

use sorted :
>>> def using_sorted(s1,s2):
... return sorted(s1)==sorted(s2)
...
>>> using_sorted("hello","llho")
False
>>> using_sorted("hello","llhoe")
True
you can also use count:
>>> def using_count(s1,s2):
... if len(s1)==len(s2):
... for x in s1:
... if s1.count(x)!=s2.count(x):
... return False
... return True
... else: return False
...
>>> using_count("abb","ab")
False
>>> using_count("abb","bab")
True
>>> using_count("hello","llohe")
True
>>> using_count("hello","llohe")
sorted solution runs in O(n lg n) complexity and the count solution runs in O(n ^ 2) complexity, whereas the Counter solution in runs in O(N).
Note collections.Counter is better to use
check #fourtheye solution

Another way without sorting considering all are alphabets:
>>> def anagram(s1, s2):
... return sum([ord(x)**2 for x in s1]) == sum([ord(x)**2 for x in s2])
...
>>> anagram('ark', 'day')
False
>>> anagram('abcdef', 'bdefa')
False
>>> anagram('abcdef', 'bcdefa')
True
>>>

Don't do it with set Theory:
Code:
a='aaab'
b='aab'
def anagram(a,b):
setA=list(a)
setB=list(b)
print setA, setB
if len(setA) !=len(setB):
print "no anagram"
diff1 =''.join(sorted(setA))
diff2= ''.join(sorted(setB))
if (diff1 == diff2 ):
print "matched"
else:
print "Mismatched"
anagram(a,b)

the anagram check with two strings
def anagrams (s1, s2):
# the sorted strings are checked
if(sorted(s1.lower())== sorted(s2.lower())):
return True
else:
return False

check anagram in one liner
def anagram_checker(str1, str2):
"""
Check if the input strings are anagrams of each other
Args:
str1(string),str2(string): Strings to be checked
Returns:
bool: Indicates whether strings are anagrams
"""
return sorted(str1.replace(" ", "").lower()) == sorted(str2.replace(" ", "").lower())
print(anagram_checker(" XYZ","z y x"))

Related

How can I write a function that verifies if a list is a non-contiguous sublist of another list, in python? [duplicate]

If I have string needle and I want to check if it exists contiguously as a substring in haystack, I can use:
if needle in haystack:
...
What can I use in the case of a non-continuous subsequence? Example:
>>> haystack = "abcde12345"
>>> needle1 = "ace13"
>>> needle2 = "123abc"
>>> is_subsequence(needle1, haystack)
True
>>> is_subsequence(needle2, haystack) # order is important!
False
I don't know if there's builtin function, but it is rather simple to do manually
def exists(a, b):
"""checks if b exists in a as a subsequence"""
pos = 0
for ch in a:
if pos < len(b) and ch == b[pos]:
pos += 1
return pos == len(b)
>>> exists("moo", "mo")
True
>>> exists("moo", "oo")
True
>>> exists("moo", "ooo")
False
>>> exists("haystack", "hack")
True
>>> exists("haystack", "hach")
False
>>>
Using an iterator trick:
it = iter(haystack)
all(x in it for x in needle)
This is only a concise version of the same idea presented in tobias_k's answer.
Another possibility: You can create iterators for both, needle and haystack, and then pop elements from the haystack-iterator until either all the characters in the needle are found, or the iterator is exhausted.
def is_in(needle, haystack):
try:
iterator = iter(haystack)
for char in needle:
while next(iterator) != char:
pass
return True
except StopIteration:
return False
We can try simple for loop and break method and pass on substring once the match is found
def substr(lstr,sstr):
lenl = len(lstr)
for i in sstr:
for j in range(lenl):
if i not in lstr:
return False
elif i == lstr[j]:
lstr = lstr[j+1:]
break
else:
pass
return True

Isomorphic Strings (the easy way?)

So I have been trying to solve the Easy questions on Leetcode and so far I dont understand most of the answers I find on the internet. I tried working on the Isomorphic strings problem (here:https://leetcode.com/problems/isomorphic-strings/description/)
and I came up with the following code
def isIso(a,b):
if(len(a) != len(b)):
return false
x=[a.count(char1) for char1 in a]
y=[b.count(char1) for char1 in b]
return x==y
string1 = input("Input string1..")
string2 = input("Input string2..")
print(isIso(string1,string2))
Now I understand that this may be the most stupid code you have seen all day but that is kinda my point. I'd like to know why this would be wrong(and where) and how I should further develop on this.
If I understand the problem correctly, because a character can map to itself, it's just a case of seeing if the character counts for the two words are the same.
So egg and add are isomorphic as they have character counts of (1,2). Similarly paper and title have counts of (1,1,1,2).
foo and bar aren't isomorphic as the counts are (1,2) and (1,1,1) respectively.
To see if the character counts are the same we'll need to sort them.
So:
from collections import Counter
def is_isomorphic(a,b):
a_counts = list(Counter(a).values())
a_counts.sort()
b_counts = list(Counter(b).values())
b_counts.sort()
if a_counts == b_counts:
return True
return False
Your code is failing because here:
x=[a.count(char1) for char1 in a]
You count the occurrence of each character in the string for each character in the string. So a word like 'odd' won't have counts of (1,2), it'll have (1,2,2) as you count d twice!
You can use two dicts to keep track of the mapping of each character in a to b, and the mapping of each character in b to a while you iterate through a, and if there's any violation in a corresponding character, return False; otherwise return True in the end.
def isIso(a, b):
m = {} # mapping of each character in a to b
r = {} # mapping of each character in b to a
for i, c in enumerate(a):
if c in m:
if b[i] != m[c]:
return False
else:
m[c] = b[i]
if b[i] in r:
if c != r[b[i]]:
return False
else:
r[b[i]] = c
return True
So that:
print(isIso('egg', 'add'))
print(isIso('foo', 'bar'))
print(isIso('paper', 'title'))
print(isIso('paper', 'tttle')) # to test reverse mapping
would output:
True
False
True
False
I tried by creating a dictionary, and it resulted in 72ms runtime.
here's my code -
def isIsomorphic(s: str, t: str) -> bool:
my_dict = {}
if len(s) != len(t):
return False
else:
for i in range(len(s)):
if s[i] in my_dict.keys():
if my_dict[s[i]] == t[i]:
pass
else:
return False
else:
if t[i] in my_dict.values():
return False
else:
my_dict[s[i]] = t[i]
return True
There are many different ways on how to do it. Below I provided three different ways by using a dictionary, set, and string.translate.
Here I provided three different ways how to solve Isomorphic String solution in Python.
from itertools import zip_longest
def isomorph(a, b):
return len(set(a)) == len(set(b)) == len(set(zip_longest(a, b)))
here is the second way to do it:
def isomorph(a, b):
return [a.index(x) for x in a] == [b.index(y) for y in b]

python Program to check whether two words are anagrams

I have written a small program to check whether two words are anagrams. If two words are anagrams it should return "true" otherwise it should return "False", but I am not getting the correct output. Kindly, tell me what is the mistake in the below program.
def anagram(s1,s2):
for x in s1:
if (x in s2) and (s2.count(x)==s1.count(x)):
pass
return(True)
else:
return(False)
You are iterating over a word and then not doing anything (with pass, the empty statement). Then you return True unconditionally, leaving no chance to return False.
Instead, you can simply sort the two words and then see if they end up the same:
def anagram(s1, s2):
return sorted(s1) == sorted(s2)
Please format this in a more readable way. However, it looks like you are calling return True inside the loop, meaning that if any character occurs the same amount of times in each string, your function will return True
Try this:
def anagram(s1,s2):
for x in s1:
if ( x in s2 ) and (s2.count(x) == s1.count(x) ):
pass
else:
return False
for x in s2:
if ( x in s1 ) and (s1.count(x) == s2.count(x) ):
pass
else:
return False
return True
You may try this:
>>> def anagram(str1,str2):
s1=str1.lower()
s2=str2.lower()
a=len(s1)
b=len(s2)
if a==b:
for c in s1:
if c in s2:
for s in s2:
if s in s1:
return True
else:
return False
else:
return False
You were really close. Your indentation was bad but that is probably due to the text formatting here in SO.
The mistake in your code was that you were returning True too soon. What you have to do is to go through all letters and check for existance and count. Below you can find a corrected and somewhat optimised version of what you were trying to do.
def anagram(s1, s2):
if set(s1) == set(s2) and all(s2.count(x) == s1.count(x) for x in set(s1)):
return True
return False
But then again #Tigerhawk's solution is way better.

How to check whether two words are anagrams python

I'm dealing with a simple problem:
checking if two strings are anagrams.
I wrote the simple code that can check whether two strings, such as
'abcd' and 'dcba', are anagrams, but I have no idea what to do with more complex ones, like "Astronomer" and "Moon starter."
line1 = input('Enter the first word: ')
line2 = input('Enter the second word: ')
def deleteSpaces(s):
s_new = s.replace(" ","")
return s_new
def anagramSolution2(s1,s2):
alist1 = list(deleteSpaces(s1))
alist2 = list(deleteSpaces(s2))
print(alist1)
print(alist2)
alist1.sort()
alist2.sort()
pos = 0
matches = True
while pos < len(deleteSpaces(s1)) and matches:
if alist1[pos]==alist2[pos]:
pos = pos + 1
else:
matches = False
return matches
Firstly I thought that the problem lies in working with spaces, but then I understood that my algorithm doesn't work if the strings are not the same size.
I have no idea what to do in that case.
Here I found a beautiful solution, but it doesn't work either:
def anagrams(s1,s2):
return [False, True][sum([ord(x) for x in s1]) == sum([ord(x) for x in s2])]
If I run this function and test it on two strings, I'll get such output:
Examples:
First Word: apple
Second Word: pleap
output: True
First Word: Moon starter
Second Word: Astronomer
output: False //however it should should be True because this words are anagrams
Your algorithm is ok. Your problem is that you don't consider upper and lower case letters. Changing the two lines
alist1 = list(deleteSpaces(s1))
alist2 = list(deleteSpaces(s2))
to
alist1 = list(deleteSpaces(s1).lower())
alist2 = list(deleteSpaces(s2).lower())
will solve your issue.
As an alternative you could simply use the following function:
def anagrams(s1, s2):
def sort(s):
return sorted(s.replace(" ", "").lower())
return sort(s1) == sort(s2)
If you want to have a faster solution with complexity of O(n), you should use a Counter instead of sorting the two words:
from collections import Counter
def anagrams(s1, s2):
def get_counter(s):
return Counter(s.replace(" ", "").lower())
return get_counter(s1) == get_counter(s2)
As other's have pointed out, your algorithm is given 'false' results as Moon starter and Astronomer are in fact not anagrams.
You can drastically improve your algorithm by simply using the available object methods. They already provide all the functionality.
def normalize_str(s):
return s.replace(" ","").lower()
def anagramSolution2(s1,s2):
return sorted(normalize_str(s1)) == sorted(normalize_str(s2))
normalize_str is like your deleteSpaces, but it also converts everything to lowercase. This way, Moon and moon will compare equal. You may want to do stricter or looser normalization in the end, it's just an example.
The call to sorted will already provide you with a list, you don't have to do an explicit conversion. Also, list comparison via == will compare the lists element wise (what you are doing with the for loop), but much more efficiently.
Something like this?
$ cat /tmp/tmp1.py
#!/usr/bin/env python
def anagram (first, second):
return sorted(first.lower()) == sorted(second.lower())
if __name__ == "__main__":
for first, second in [("abcd rsh", "abcd x rsh"), ("123 456 789", "918273645 ")]:
print("is anagram('{0}', '{1}')? {2}".format(first, second, anagram(first, second)))
which gives:
$ python3 /tmp/tmp1.py
is anagram('abcd rsh', 'abcd x rsh')? False
is anagram('123 456 789', '918273645 ')? True
a=input("string1:");
b=input("string2:");
def anagram(a,b):
arra=split(a)
arrb=split(b)
arra.sort()
arrb.sort()
if (len(arra)==len(arrb)):
if(arra==arrb):
print ("True")
else:
ana=0;
print ("False");
else:
print ("False");
def split(x):
x=x.replace(' ','').lower()
temp=[]
for i in x:
temp.append(i)
return temp;
anagram(a,b)
Lazy way of doing it but super clean:
def anagram(a,b):
return cleanString(a) == cleanString(b)
def cleanString(string):
return ''.join(sorted(string)).lower().split()
def anagram(str1,str2):
l1=list(str1)
l2=list(str2)
if sorted(l1) == sorted(l2):
print("yess")
else:
print("noo")
str1='listen'
str2='silent'
anagram(str1,str2)
although not the most optimized solution but works fine.
I found this the best way:
from collections import Counter
def is_anagram(string1, string2):
return Counter(string1) == Counter(string2)
I tried this and it worked
def is_anagram(word1,word2):
a = len(word1)
b = len(word2)
if a == b:
for l in word1:
if l in word2:
return True
return False

How can I return false if more than one number while ignoring "0"'s?

This is a function in a greater a program that solves a sudoku puzzle. At this point, I would like the function to return false if there is more then 1 occurrence of a number unless the number is zero. What do am I missing to achieve this?
L is a list of numbers
l =[1,0,0,2,3,0,0,8,0]
def alldifferent1D(l):
for i in range(len(l)):
if l.count(l[i])>1 and l[i] != 0: #does this do it?
return False
return True
Assuming the list is length 9, you can ignore the inefficiency of using count here (Using a helper datastructure - Counter etc probably takes longer than running .count() a few times). You can write the expression to say they are all different more naturally as:
def alldifferent1D(L):
return all(L.count(x) <= 1 for x in L if x != 0)
This also saves calling count() for all the 0's
>>> from collections import counter
>>> def all_different(xs):
... return len(set(Counter(filter(None, xs)).values()) - set([1])) == 0
Tests:
>>> all_different([])
True
>>> all_different([0,0,0])
True
>>> all_different([0,0,1,2,3])
True
>>> all_different([1])
True
>>> all_different([1,2])
True
>>> all_different([0,2,0,1,2,3])
False
>>> all_different([2,2])
False
>>> all_different([1,2,3,2,2,3])
False
So we can break this down into two problems:
Getting rid of the zeros, since we don't care about them.
Checking if there are any duplicate numbers.
Striping the zeros is easy enough:
filter(lambda a: a != 0, x)
And we can check for differences in a set (which has only one of each element) and a list
if len(x) == len(set(x)):
return True
return False
Making these into functions we have:
def remove_zeros(x):
return filter(lambda a: a != 0, x)
def duplicates(x):
if len(x) == len(set(x)):
return True
return False
def alldifferent1D(x):
return duplicates(remove_zeros(x))
One way to avoid searching for every entry in every position is to:
flags = (len(l)+1)*[False];
for cell in l:
if cell>0:
if flags[cell]:
return False
flags[cell] = True
return True
The flags list has a True at index k if the value k has been seen before in the list.
I'm sure you could speed this up with list comprehension and an all() or any() test, but this worked well enough for me.
PS: The first intro didn't survive my edit, but this is from a Sudoku solver I wrote years ago. (Python 2.4 or 2.5 iirc)

Categories