Isomorphic Strings (the easy way?) - python

So I have been trying to solve the Easy questions on Leetcode and so far I dont understand most of the answers I find on the internet. I tried working on the Isomorphic strings problem (here:https://leetcode.com/problems/isomorphic-strings/description/)
and I came up with the following code
def isIso(a,b):
if(len(a) != len(b)):
return false
x=[a.count(char1) for char1 in a]
y=[b.count(char1) for char1 in b]
return x==y
string1 = input("Input string1..")
string2 = input("Input string2..")
print(isIso(string1,string2))
Now I understand that this may be the most stupid code you have seen all day but that is kinda my point. I'd like to know why this would be wrong(and where) and how I should further develop on this.

If I understand the problem correctly, because a character can map to itself, it's just a case of seeing if the character counts for the two words are the same.
So egg and add are isomorphic as they have character counts of (1,2). Similarly paper and title have counts of (1,1,1,2).
foo and bar aren't isomorphic as the counts are (1,2) and (1,1,1) respectively.
To see if the character counts are the same we'll need to sort them.
So:
from collections import Counter
def is_isomorphic(a,b):
a_counts = list(Counter(a).values())
a_counts.sort()
b_counts = list(Counter(b).values())
b_counts.sort()
if a_counts == b_counts:
return True
return False
Your code is failing because here:
x=[a.count(char1) for char1 in a]
You count the occurrence of each character in the string for each character in the string. So a word like 'odd' won't have counts of (1,2), it'll have (1,2,2) as you count d twice!

You can use two dicts to keep track of the mapping of each character in a to b, and the mapping of each character in b to a while you iterate through a, and if there's any violation in a corresponding character, return False; otherwise return True in the end.
def isIso(a, b):
m = {} # mapping of each character in a to b
r = {} # mapping of each character in b to a
for i, c in enumerate(a):
if c in m:
if b[i] != m[c]:
return False
else:
m[c] = b[i]
if b[i] in r:
if c != r[b[i]]:
return False
else:
r[b[i]] = c
return True
So that:
print(isIso('egg', 'add'))
print(isIso('foo', 'bar'))
print(isIso('paper', 'title'))
print(isIso('paper', 'tttle')) # to test reverse mapping
would output:
True
False
True
False

I tried by creating a dictionary, and it resulted in 72ms runtime.
here's my code -
def isIsomorphic(s: str, t: str) -> bool:
my_dict = {}
if len(s) != len(t):
return False
else:
for i in range(len(s)):
if s[i] in my_dict.keys():
if my_dict[s[i]] == t[i]:
pass
else:
return False
else:
if t[i] in my_dict.values():
return False
else:
my_dict[s[i]] = t[i]
return True

There are many different ways on how to do it. Below I provided three different ways by using a dictionary, set, and string.translate.
Here I provided three different ways how to solve Isomorphic String solution in Python.

from itertools import zip_longest
def isomorph(a, b):
return len(set(a)) == len(set(b)) == len(set(zip_longest(a, b)))
here is the second way to do it:
def isomorph(a, b):
return [a.index(x) for x in a] == [b.index(y) for y in b]

Related

Checking interweaving strings in pythonic way

I have a problem for checking the interweaving strings. So, I have 3 strings and I need to check whether the third string can be formed by interweaving the first two strings.
To simplify, interweaving strings means to merge them by alternating their letters without any specific pattern.
For example: "abc" and "123" are two strings and possible interweaving can be "abc123" or "a1b2c3" or "ab1c23" etc mixing them without any pattern but following order sequence.
another example, for false case:
str1 = "aabcc",
str2 = "dbbca",
str3 = "aadbbbaccc" # This is false case, where str3 is not interweaving from str1 and str2
what i have tried:
I have tried below solution which is working for me. returning True if string c is interweaving of string a and b else returning False
def weavingstring(a,b,c):
for i in c:
if len(a)>0:
if i == a[0]:
a=a[1:]
if len(b)>0:
if i == b[0]:
b=b[1:]
if len(a) > 0 or len(b) > 0:
return False
return True
print(weavingstring("abc", "def", "abcdef")) # True here
print(weavingstring("aabcc", "dbbca", "aadbbbaccc")) # False here
What i want
I have the normal solution, I wanted to know if there is any pythonic way of doing this by using list comprehension, map, filter, reduce or lambda functions etc.
One approach of using the lru_cache with lambda to speed things up:
from functools import lru_cache
class Solution:
def isInterleave(self, s1: str, s2: str, s3: str) -> bool:
return (memo:=lru_cache()
( lambda i=0, j=0, k=0:
(i, j, k) == (len(s1), len(s2), len(s3))
or ( i < len(s1) and k < len(s3) and s1[i] == s3[k] and memo( i+1, j, k+1 ) )
or ( j < len(s2) and k < len(s3) and s2[j] == s3[k] and memo( i, j+1, k+1 ) ) )
)()
May I interest you in a simple recursive solution?
def weaving(a: str, b: str, res: str) -> bool:
if not a: return b == res
if not b: return a == res
if not res: return False
if res[0] == a[0]:
return weaving(a[1:], b, res[1:])
elif res[0] == b[0]:
return weaving(a, b[1:], res[1:])
else:
return False
A simple solution would be to use regexes.
import re
def iswoven(first, second, test):
return all((re.search(r'.*'.join(x), test) for x in (first, second)))
>>> iswoven('abc', '123', 'a1b2c23')
True
>>> iswoven('abc', '123', 'a1b223')
False
This will check that the test string matches both 'a.*b.*c' and '1.*2.*3' and return the result.
Depending on the length of your strings, this could end up trying to match on some pretty hairy regexes, so buyer beware.

How to check if a sequence in a tuple is part of a tuple? python [duplicate]

If I have this:
a='abcdefghij'
b='de'
Then this finds b in a:
b in a => True
Is there a way of doing an similar thing with lists?
Like this:
a=list('abcdefghij')
b=list('de')
b in a => False
The 'False' result is understandable - because its rightly looking for an element 'de', rather than (what I happen to want it to do) 'd' followed by 'e'
This is works, I know:
a=['a', 'b', 'c', ['d', 'e'], 'f', 'g', 'h']
b=list('de')
b in a => True
I can crunch the data to get what I want - but is there a short Pythonic way of doing this?
To clarify: I need to preserve ordering here (b=['e','d'], should return False).
And if it helps, what I have is a list of lists: these lists represents all possible paths (a list of visited nodes) from node-1 to node-x in a directed graph: I want to 'factor' out common paths in any longer paths. (So looking for all irreducible 'atomic' paths which constituent all the longer paths).
Related
Best Way To Determine if a Sequence is in another sequence in Python
I suspect there are more pythonic ways of doing it, but at least it gets the job done:
l=list('abcdefgh')
pat=list('de')
print pat in l # Returns False
print any(l[i:i+len(pat)]==pat for i in xrange(len(l)-len(pat)+1))
Don't know if this is very pythonic, but I would do it in this way:
def is_sublist(a, b):
if not a: return True
if not b: return False
return b[:len(a)] == a or is_sublist(a, b[1:])
Shorter solution is offered in this discussion, but it suffers from the same problems as solutions with set - it doesn't consider order of elements.
UPDATE:
Inspired by MAK I introduced more concise and clear version of my code.
UPDATE:
There are performance concerns about this method, due to list copying in slices. Also, as it is recursive, you can encounter recursion limit for long lists. To eliminate copying, you can use Numpy slices which creates views, not copies. If you encounter performance or recursion limit issues you should use solution without recursion.
I think this will be faster - It uses C implementation list.index to search for the first element, and goes from there on.
def find_sublist(sub, bigger):
if not bigger:
return -1
if not sub:
return 0
first, rest = sub[0], sub[1:]
pos = 0
try:
while True:
pos = bigger.index(first, pos) + 1
if not rest or bigger[pos:pos+len(rest)] == rest:
return pos
except ValueError:
return -1
data = list('abcdfghdesdkflksdkeeddefaksda')
print find_sublist(list('def'), data)
Note that this returns the position of the sublist in the list, not just True or False. If you want just a bool you could use this:
def is_sublist(sub, bigger):
return find_sublist(sub, bigger) >= 0
I timed the accepted solution, my earlier solution and a new one with an index. The one with the index is clearly best.
EDIT: I timed nosklo's solution, it's even much better than what I came up with. :)
def is_sublist_index(a, b):
if not a:
return True
index = 0
for elem in b:
if elem == a[index]:
index += 1
if index == len(a):
return True
elif elem == a[0]:
index = 1
else:
index = 0
return False
def is_sublist(a, b):
return str(a)[1:-1] in str(b)[1:-1]
def is_sublist_copylist(a, b):
if a == []: return True
if b == []: return False
return b[:len(a)] == a or is_sublist_copylist(a, b[1:])
from timeit import Timer
print Timer('is_sublist([99999], range(100000))', setup='from __main__ import is_sublist').timeit(number=100)
print Timer('is_sublist_copylist([99999], range(100000))', setup='from __main__ import is_sublist_copylist').timeit(number=100)
print Timer('is_sublist_index([99999], range(100000))', setup='from __main__ import is_sublist_index').timeit(number=100)
print Timer('sublist_nosklo([99999], range(100000))', setup='from __main__ import sublist_nosklo').timeit(number=100)
Output in seconds:
4.51677298546
4.5824368
1.87861895561
0.357429027557
So, if you aren't concerned about the order the subset appears, you can do:
a=list('abcdefghij')
b=list('de')
set(b).issubset(set(a))
True
Edit after you clarify: If you need to preserve order, and the list is indeed characters as in your question, you can use:
''.join(a).find(''.join(b)) > 0
This should work with whatever couple of lists, preserving the order.
Is checking if b is a sub list of a
def is_sublist(b,a):
if len(b) > len(a):
return False
if a == b:
return True
i = 0
while i <= len(a) - len(b):
if a[i] == b[0]:
flag = True
j = 1
while i+j < len(a) and j < len(b):
if a[i+j] != b[j]:
flag = False
j += 1
if flag:
return True
i += 1
return False
>>>''.join(b) in ''.join(a)
True
Not sure how complex your application is, but for pattern matching in lists, pyparsing is very smart and easy to use.
Use the lists' string representation and remove the square braces. :)
def is_sublist(a, b):
return str(a)[1:-1] in str(b)
EDIT: Right, there are false positives ... e.g. is_sublist([1], [11]). Crappy answer. :)

Basic Anagram VS more advanced one

I'm currently working my way through a "List" unit and in one of the exercises we need to create an anagram (for those who don't know; two words are an anagram if you can rearrange the letters from one to spell the other).
The easier solution that comes to mind is:
def is_anagram(a, b):
return sorted(a) == sorted(b)
print is_anagram('god', 'dog')
It works, but it doesn't really satisfy me. If we were in this situation for example:
def is_anagram(a, b):
return sorted(a) == sorted(b)
print is_anagram('god', 'dyog') #extra letter 'd' in second argument
>>> False
Return is False, although we should be able to to build the word 'god' out of 'dyog'. Perhaps this game/problem isn't called an anagram, but I've been trying to figure it out anyway.
Technically my solution is to:
1- Iterate through every element of b.
2- As I do, I check if that element exists in a.
3- If all of them do; then we can create a out of b.
4- Otherwise, we can't.
I just can't seem to get it to work. For the record, I do not know how to use lambda
Some tests:
print is_anagram('god', 'ddog') #should return True
print is_anagram('god', '9d8s0og') #should return True
print is_anagram('god', '_#10d_o g') #should return True
Thanks :)
Since the other answers do not, at time of writing, support reversing the order:
Contains type hints for convenience, because why not?
# Just strip hints out if you're in Python < 3.5.
def is_anagram(a: str, b: str) -> bool:
long, short = (a, b) if len(a) > len(b) else (b, a)
cut = [x for x in long if x in short]
return sorted(cut) == sorted(short)
If, in future, you do learn to use lambda, an equivalent is:
# Again, strip hints out if you're in Python < 3.5.
def is_anagram(a: str, b: str) -> bool:
long, short = (a, b) if len(a) > len(b) else (b, a)
# Parentheses around lambda are not necessary, but may help readability
cut = filter((lambda x: x in short), long)
return sorted(cut) == sorted(short)
If you need to check if a word could be created from b you can do this
def is_anagram(a,b):
b_list = list(b)
for i_a in a:
if i_a in b_list:
b_list.remove(i_a)
else:
return False
return True
UPDATE(Explanation)
b_list = list(b) makes a list of str objects(characters).
>>> b = 'asdfqwer'
>>> b_list = list(b)
>>> b_list
['a', 's', 'd', 'f', 'q', 'w', 'e', 'r']
What basically is happening in the answer: We check if every character in a listed in b_list, when occurrence happens we remove that character from b_list(we do that to eliminate possibility of returning True with input 'good', 'god'). So if there is no occurrence of another character of a in rest of b_list then it's not advanced anagram.
def is_anagram(a, b):
test = sorted(a) == sorted(b)
testset = b in a
testset1 = a in b
if testset == True:
return True
if testset1 == True:
return True
else:
return False
No better than the other solution. But if you like verbose code, here's mine.
Try that:
def is_anagram(a, b):
word = [filter(lambda x: x in a, sub) for sub in b]
return ''.join(word)[0:len(a)]
Example:
>>> is_anagram('some','somexcv')
'some'
Note: This code returns the common word if there is any or '' otherwise. (it doesn't return True/False as the OP asked - realized that after, but anyway this code can easily change to adapt the True/False type of result - I won't fix it now such there are great answers in this post that already do :) )
Brief explanation:
This code takes(filter) every letter(sub) on b word that exists in a word also. Finally returns the word which must have the same length as a([0:len(a)]). The last part with len needs for cases like this:
is_anagram('some','somenothing')
I think all this sorting is a red herring; the letters should be counted instead.
def is_anagram(a, b):
return all(a.count(c) <= b.count(c) for c in set(a))
If you want efficiency, you can construct a dictionary that counts all the letters in b and decrements these counts for each letter in a. If any count is below 0, there are not enough instances of the character in b to create a. This is an O(a + b) algorithm.
from collections import defaultdict
def is_anagram(a, b):
b_dict = defaultdict(int)
for char in b:
b_dict[char] += 1
for char in a:
b_dict[char] -= 1
if b_dict[char] < 0:
return False
return True
Iterating over b is inevitable since you need to count all the letters there. This will exit as soon as a character in a is not found, so I don't think you can improve this algorithm much.

Understanding isomorphic strings algorithm

I am understanding the following code to find if the strings are isomorphic or not. The code uses two hashes s_dict and t_dict respectively. I am assuming the strings are of same length.
def isIsomorphic(s, t):
s_dict = {}
t_dict = {}
for i in range(len(s)):
if s[i] in s_dict.keys() and s_dict[s[i]] != t[i]:
return False
if t[i] in t_dict.keys() and t_dict[t[i]] != s[i]:
return False
s_dict[s[i]] = t[i]
t_dict[t[i]] = s[i]
return True
Now, if I modify the above code such that only one hash s_dict() is used, then also it gives desired results to my limited test cases. The modified code is as follows:
def isIsomorphic(s, t):
s_dict = {}
for i in range(len(s)):
if s[i] in s_dict.keys() and s_dict[s[i]] != t[i]:
return False
s_dict[s[i]] = t[i]
return True
What are the test cases in which the above modified code will fail? Is my understanding of the isomorphic strings wrong?
One simple example, your code doesn't work on s='ab',t='aa'.
Basically you have to have both way to be isomorphic. Your code only checked that t can be modified from s, but not the other way around.
This was kind of fun to look at. Just for kicks, here's my solution using itertools.groupby
from itertools import groupby
from collections import defaultdict
def get_idx_count(word):
"""Turns a word into a series of tuples (idx, consecutive_chars)
"aabbccaa" -> [[(0, 2), (3, 2)], [(1, 2)], [(2, 2)]]
"""
lst = defaultdict(list)
for idx, (grp, val) in enumerate(groupby(word)):
lst[grp].append((idx, sum(1 for _ in val)))
return sorted(list(lst.values()))
def is_isomorphic(a, b):
return get_idx_count(a) == get_idx_count(b)
is_isomorphic('aabbcc', 'bbddcc') # True
is_isomorphic('aabbaa', 'bbddcc') # False
Rather than building the lists, I feel like I could do something more like:
from itertools import groupby
from collections import defaultdict
def is_isomorphic(a, b):
a_idxs, b_idxs = defaultdict(set), defaultdict(set)
for idx, ((a_grp, a_vals), (b_grp, b_vals)) in enumerate(zip(groupby(a), groupby(b))):
if sum(1 for _ in a_vals) != sum(1 for _ in b_vals):
return False
# ensure sequence is of same length
if a_grp in a_idxs and b_idxs[b_grp] != a_idxs[a_grp] or\
b_grp in b_idxs and a_idxs[a_grp] != b_idxs[b_grp]:
return False
# ensure previous occurrences are matching groups
a_idxs[a_grp].add(idx)
b_idxs[b_grp].add(idx)
# save indexes for future checks
return True
But I haven't had a chance to test that whatsoever. It does, however, have the likelihood of exiting early, rather than having to build all the lists and comparing them at the end. It also skips a few sets of sorting that shouldn't be necessary anyway, except that we're pulling from dicts, so bleh. Pretty sure list.sort is faster than saving to to a collections.OrderedDict. I think.
Here is a solution for Isomorphic String problem in Python:
Solution #1 (the fastest one):
def isometric_strings_translate(str1: str, str2: str) -> bool:
trans = str.maketrans(str1, str2)
return str1.translate(trans) == str2
Solution #2 (using set):
def isometric_strings_set(str1: str, str2: str) -> bool:
return len(set(zip(str1, str2))) == len(set(str1))
Solution #3 (using dictionary):
def isometric_strings_dict(str1: str, str2: str) -> bool:
map = {}
for s1, s2 in zip(str1, str2):
if map.setdefault(s1, s2) != s2:
return False
return True
Here you can view full code.
Thinking about it without maps or dictionaries can help you understand:
def isIso(x,y):
if len(x) != len(y):
return False
for i in range(len(x)):
count = 0
if x.count(x[i]) != y.count(y[i]):
return False
return True

How can I return false if more than one number while ignoring "0"'s?

This is a function in a greater a program that solves a sudoku puzzle. At this point, I would like the function to return false if there is more then 1 occurrence of a number unless the number is zero. What do am I missing to achieve this?
L is a list of numbers
l =[1,0,0,2,3,0,0,8,0]
def alldifferent1D(l):
for i in range(len(l)):
if l.count(l[i])>1 and l[i] != 0: #does this do it?
return False
return True
Assuming the list is length 9, you can ignore the inefficiency of using count here (Using a helper datastructure - Counter etc probably takes longer than running .count() a few times). You can write the expression to say they are all different more naturally as:
def alldifferent1D(L):
return all(L.count(x) <= 1 for x in L if x != 0)
This also saves calling count() for all the 0's
>>> from collections import counter
>>> def all_different(xs):
... return len(set(Counter(filter(None, xs)).values()) - set([1])) == 0
Tests:
>>> all_different([])
True
>>> all_different([0,0,0])
True
>>> all_different([0,0,1,2,3])
True
>>> all_different([1])
True
>>> all_different([1,2])
True
>>> all_different([0,2,0,1,2,3])
False
>>> all_different([2,2])
False
>>> all_different([1,2,3,2,2,3])
False
So we can break this down into two problems:
Getting rid of the zeros, since we don't care about them.
Checking if there are any duplicate numbers.
Striping the zeros is easy enough:
filter(lambda a: a != 0, x)
And we can check for differences in a set (which has only one of each element) and a list
if len(x) == len(set(x)):
return True
return False
Making these into functions we have:
def remove_zeros(x):
return filter(lambda a: a != 0, x)
def duplicates(x):
if len(x) == len(set(x)):
return True
return False
def alldifferent1D(x):
return duplicates(remove_zeros(x))
One way to avoid searching for every entry in every position is to:
flags = (len(l)+1)*[False];
for cell in l:
if cell>0:
if flags[cell]:
return False
flags[cell] = True
return True
The flags list has a True at index k if the value k has been seen before in the list.
I'm sure you could speed this up with list comprehension and an all() or any() test, but this worked well enough for me.
PS: The first intro didn't survive my edit, but this is from a Sudoku solver I wrote years ago. (Python 2.4 or 2.5 iirc)

Categories