Find the index of the first occurrence of a substring in a given string - Python

Google or Amazon asks the following question in interviews. Would my solution be accepted?
Problem: find the index of the first occurrence of the given word in the given string.
Note: The problem above is from a website, and the following code passed all the test cases. However, I am not sure whether this is the most optimal solution, and so whether it would be accepted by the big companies.
def strStr(A, B):
    if len(A) == 0 or len(B) == 0:
        return -1
    for i in range(len(A)):
        c = A[i:i+len(B)]
        if c == B:
            return i
    else:
        return -1

There are a few algorithms you can learn on this topic, such as the Rabin-Karp algorithm, the Z algorithm, and the KMP (Knuth-Morris-Pratt) algorithm.
The Z algorithm and KMP run in O(n+m) time, where n is the string length and m is the pattern length; Rabin-Karp achieves O(n+m) on average (its worst case is O(n*m) when many hash collisions occur). Your algorithm runs in O(n*m) time. I would suggest starting with the Rabin-Karp algorithm; I personally found it the easiest to grasp.
There are also more advanced topics, such as searching for many patterns in one string with the Aho-Corasick algorithm, which is worth reading about. I think this is what grep uses when searching for multiple patterns.
Hope it helps :)
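To make the Rabin-Karp suggestion concrete, here is a minimal sketch rather than a definitive implementation. The function name rabin_karp and the base and mod parameters are assumptions chosen for illustration, and the function returns -1 for an empty pattern only to mirror the convention of the code in the question.

```python
def rabin_karp(text, pattern, base=256, mod=1_000_000_007):
    """Return the index of the first occurrence of pattern in text, or -1."""
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return -1  # same convention as the question: empty input gives -1
    high = pow(base, m - 1, mod)  # weight of the window's leading character
    p_hash = t_hash = 0
    for j in range(m):  # hash the pattern and the first window of the text
        p_hash = (p_hash * base + ord(pattern[j])) % mod
        t_hash = (t_hash * base + ord(text[j])) % mod
    for i in range(n - m + 1):
        # Compare the strings only when the hashes agree, to rule out collisions.
        if p_hash == t_hash and text[i:i + m] == pattern:
            return i
        if i < n - m:  # slide the window: drop text[i], append text[i + m]
            t_hash = ((t_hash - ord(text[i]) * high) * base + ord(text[i + m])) % mod
    return -1

print(rabin_karp("abcdef", "bcd"))  # prints 1
```

The rolling update costs O(1) per position, so the string comparison only runs when the two hashes match.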

Python actually has a built-in method for this, which is why this question doesn't seem like a great fit for interviews in Python. Something like this would suffice:
def strStr(A, B):
    return A.find(B)
Note that A.find(B) returns 0 for an empty B, whereas your version returns -1.
Otherwise, as commenters have mentioned, inputs/outputs and tests are important. You could add some checks that make it slightly more performant (e.g. check that B is not longer than A), but I think that, in general, you won't do better than O(n).

If you want to match an entire word against the words in the string, your code would not work.
E.g., if I call print(strStr('world hello world', 'wor')), your code prints 0, but it should print -1.
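If whole-word matching is what is wanted, one possible sketch uses the standard re module with word boundaries. The name find_word is hypothetical, and the sketch assumes the word itself consists of word characters (letters, digits, underscore), since \b behaves differently next to punctuation.

```python
import re

def find_word(text, word):
    """Index of the first occurrence of word as a whole word, or -1."""
    # re.escape keeps any special characters in word from acting as regex syntax.
    match = re.search(r"\b" + re.escape(word) + r"\b", text)
    return match.start() if match else -1

print(find_word("world hello world", "wor"))    # prints -1
print(find_word("world hello world", "hello"))  # prints 6
```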

I checked your function; it works well in Python 3.6:
print(strStr('abcdef', 'bcd'))  # your function: prints 1 (indices start from 0)
print("adbcdef".find('bcd'))  # built-in str.find: prints 2 (also 0-based; this string has an extra 'd' before 'bcd')

To get the index of the first occurrence, use index() or find():
text = 'hello i am homer simpson'
index = text.index('homer')
print(index)
index = text.find('homer')
print(index)
output:
11
11
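The two methods only differ when the substring is missing: find() returns -1, while index() raises ValueError. A short sketch of that difference, where the search word 'bart' is just an illustrative value that does not occur in the text:

```python
text = 'hello i am homer simpson'

print(text.find('bart'))  # prints -1: find signals "not found" with -1

try:
    text.index('bart')
except ValueError as exc:
    # index raises instead of returning a sentinel value
    print('index raised ValueError:', exc)
```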

It is always better to go for the built-in Python functions.
But sometimes in interviews they will ask you to implement it yourself. The best thing to do is to start with the simplest version, then think about corner cases and improvements.
Here is a test with your version, a slightly improved one that avoids allocating a new string at every index, and the Python built-in:
A = "aaa foo baz fooz bar aaa"
B = "bar"
def strInStr1(A, B):
    if len(A) == 0 or len(B) == 0:
        return -1
    for i in range(len(A)):
        c = A[i:i+len(B)]
        if c == B:
            return i
    else:
        return -1
def strInStr2(A, B):
    size = len(B)
    for i in range(len(A)):
        if A[i] == B[0]:
            if A[i:i+size] == B:
                return i
    return -1
def strInStr3(A, B):
    return A.index(B)
import timeit
setup = '''from __main__ import strInStr1, strInStr2, strInStr3, A, B'''
for f in ("strInStr1", "strInStr2", "strInStr3"):
    result = timeit.timeit(f"{f}(A, B)", setup=setup)
    print(f"{f}: ", result)
The results speak for themselves (time in seconds for timeit's default of 1,000,000 calls):
strInStr1: 15.809420814999612
strInStr2: 7.687011377005547
strInStr3: 0.8342400040055509

Related

Rewriting recursive algorithm to memoized algorithm

I have written the following recursive algorithm:
p = [2,3,2,1,4]
def fn(c,i):
    if(c < 0 or i < 0):
        return 0
    if(c == 0):
        return 1
    return fn(c,i-1)+fn(c-p[i-1],i-1)
It's a solution to a problem where you have c coins, and you have to find out how many ways you can spend your c coins on beers. There are n different beers, and only one of each beer.
i denotes the i'th beer, with price p[i]; the prices are stored in the array p.
The algorithm recursively calls itself, and if c == 0, it returns 1, as it has found a valid permutation. If c or i is less than 0, it returns 0, as it's not a valid permutation: it exceeds the amount of coins available.
Now I need to rewrite the algorithm as a memoized algorithm. This is my first time trying this, so I'm a little confused about how to do it.
I've been trying different things; my latest attempt is the following code:
import numpy as np
p = [2,3,2,1,4]
prev = np.empty([5, 5])
def fni(c,i):
    if(prev[c][i] != None):
        return prev[c][i]
    if(c < 0 or i < 0):
        prev[c][i] = 0
        return 0
    if(c == 0):
        prev[c][i] = 1
        return 1
    prev[c][i] = fni(c,i-1)+fni(c-p[i-1],i-1)
    return prev[c][i]
"Surprisingly", it doesn't work, and I'm sure it's completely wrong. My thought was to save the results of the recursive calls in a 2D array of 5x5, and to check at the start whether the result is already saved in the array; if it is, just return it.
I only provided my attempt above to show something, so don't take the code too seriously.
My prev array is all 0's, and should hold null values, so just ignore that.
My task is actually only to solve it in pseudocode, but I thought it would be easier to write it as code to make sure that it would actually work, so pseudocode would help as well.
I hope I have provided enough information; if not, feel free to ask!
EDIT: I forgot to mention that I have 5 coins and 5 different beers (one of each beer). So c = 5 and i = 5.
First, np.empty() by default gives an array of uninitialized values, not Nones, as the documentation points out:
>>> np.empty([2, 2])
array([[ -9.74499359e+001,   6.69583040e-309],
       [  2.13182611e-314,   3.06959433e-309]])         #uninitialized
Secondly, although this is more subjective, you should default to using dictionaries for memoization in Python. Arrays may be more efficient if you know you'll actually memoize most of the possible values, but it can be hard to tell that ahead of time. At the very least, make sure your array values are initialized. It's good that you're using numpy: that will help you avoid the common beginner mistake of writing memo = [[0]*5]*5, which creates five references to the same row.
Thirdly, you should perform checks for 'out of bounds' or negative parameters (c < 0 or i < 0) before you use them to access an array as in prev[c][i] != None. Negative indices in Python could map you to a different memoized parameter's value.
Besides those details, your memoization code and strategy are sound.
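Putting those points together, a dictionary-based version might look like the sketch below. The names memo and fn_memo are hypothetical, and the cache is keyed by the (c, i) pair; the bounds checks run before any cache access, so negative arguments never reach the dictionary.

```python
p = [2, 3, 2, 1, 4]
memo = {}  # maps (c, i) to the number of ways already computed

def fn_memo(c, i):
    # Bounds checks come first, so invalid arguments never touch the cache.
    if c < 0 or i < 0:
        return 0
    if c == 0:
        return 1
    if (c, i) not in memo:
        memo[(c, i)] = fn_memo(c, i - 1) + fn_memo(c - p[i - 1], i - 1)
    return memo[(c, i)]

print(fn_memo(5, 5))  # prints 4, the same value the plain recursion returns
```

The standard library's functools.lru_cache decorator would achieve the same effect without managing the dictionary by hand.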

How do I not repeat my list comprehension in a lambda

I am trying to be unnecessarily fancy with a challenge from codesignal. The problem: "Given a number and a range, find the largest integer within the given range that's divisible by the given number."
I have l for left boundary, r for right boundary, and d for the divisor. If none of the numbers within the boundary are divisible, then the function must return a -1. Otherwise, return the largest divisible number.
Is there a way to avoid repeating the list comprehension?
Is there a better way to do this altogether? (that is equally unreadable and unnecessary of course)
These receive a NameError: name '_' is not defined, which makes sense.
maxDivisor = lambda l,r,d: _[0] if [i for i in range(l,r+1)[::-1] if i%d==0] else -1
maxDivisor = lambda l,r,d: [i for i in range(l,r+1)[::-1] if i%d==0][0] if _ else -1
This works, but I don't want to repeat myself:
maxDivisor = lambda l,r,d: [i for i in range(l,r+1)[::-1] if i%d==0][0] if [i for i in range(l,r+1)[::-1] if i%d==0] else -1
This works, but is too readable:
def maxDivisor(left, right, divisor):
    for i in range(left,right+1)[::-1]:
        if i%divisor ==0:
            return i
    return -1
Just to reiterate:
maxDivisor(-99,-96,5) should return -1 and
maxDivisor(1,10,3) should return 9.
Thank you for your help with my unnecessary request.
Do not write bad and unreadable code just for the sake of writing bad and unreadable code.1)
Instead, I'd suggest using max with a generator expression and a default, which does, and reads, exactly what you want: Get the max number in this range which is a divisor, or -1 if no such thing exists.
res = max((x for x in range(l, r+1) if x%d==0), default=-1)
Similar, but maybe closer in spirit to what you were trying, you could use next on the filtered reversed range to get the largest such element, or -1 as default.
res = next((x for x in range(r, l-1, -1) if x%d==0), -1)
If you really want to be "fancy", though, how about this: Instead of testing all the numbers, just get the result directly in O(1):
res = r - (r % d) if (r - (r % d) >= l) else -1
(All of the parens are unnecessary here, but IMHO make it more readable, so this even fulfills part of your requirement.)
From your comment, it seems like you are trying "Code Golf", where the goal is to have the shortest code possible. In this case, you might go with the third approach, but use this variant without the ternary ... if ... else .... This should also fully qualify for your "unnecessary and unreadable" requirement:
x=[r-r%d,-1][r-r%d<l] # for code-golf only!
I will not tell you how it works, though, you have to find this out for yourself.
1) Unless this is some sort of obfuscated-code-challenge, maybe.
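As a sanity check, the three readable approaches from this answer can be wrapped in functions and compared on the two examples from the question. The function names by_max, by_next and by_formula are hypothetical, and the O(1) formula assumes a positive divisor d, since it relies on Python's % returning a non-negative remainder in that case.

```python
def by_max(l, r, d):
    return max((x for x in range(l, r + 1) if x % d == 0), default=-1)

def by_next(l, r, d):
    return next((x for x in range(r, l - 1, -1) if x % d == 0), -1)

def by_formula(l, r, d):
    candidate = r - (r % d)  # largest multiple of d that is <= r (for d > 0)
    return candidate if candidate >= l else -1

for args in [(-99, -96, 5), (1, 10, 3)]:
    print(args, by_max(*args), by_next(*args), by_formula(*args))
```

The first example yields -1 and the second yields 9 for all three functions, matching the expectations stated in the question.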

Python - Finding Index Location Function

What is the complexity of the following function?
def find_set(string, chars):
    schars = set(chars)
    for i, c in enumerate(string):
        if c in schars:
            return i
    return -1
print(find_set("Happy birthday", "py"))
In this instance, 2 is returned, since the first p is at index 2 of "Happy birthday".
Is it possible to further optimize this function?
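If chars is scanned as a plain string or list for every character, the (worst case) time complexity is O(len(string) * len(chars)). Yes, you can do better (at least from an algorithms perspective) by testing membership against a set: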
def find_set(string, chars):
    schars = set(chars)
    return next((i for i, c in enumerate(string) if c in schars), -1)
This should execute in O(len(chars) + len(string)) (worst case). Of course, when it comes to "optimization", usually you should forget what you think you know and profile. Just because mine has better algorithmic complexity doesn't mean that it will perform better on your real world data.
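Following the advice to profile, here is a rough sketch of how the two membership strategies could be timed with timeit. The name find_seq and the test inputs are hypothetical, chosen so that no character matches and the whole string is scanned; the actual timings will depend on your machine and data, so none are shown.

```python
import timeit

def find_seq(string, chars):
    # Membership test scans chars linearly for every character of string.
    return next((i for i, c in enumerate(string) if c in chars), -1)

def find_set(string, chars):
    schars = set(chars)  # one-off O(len(chars)) cost, then O(1) average lookups
    return next((i for i, c in enumerate(string) if c in schars), -1)

print(find_set("Happy birthday", "py"))  # prints 2

# Hypothetical worst-case-style input: no character of string occurs in chars.
string = "a" * 2_000
chars = "bcdefghijklmnopqrstuvwxyz" * 4
for f in (find_seq, find_set):
    print(f.__name__, timeit.timeit(lambda: f(string, chars), number=200))
```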

How many combinations are possible?

The recursive formula for computing the number of ways of choosing k items out of a set of n items, denoted C(n,k), is:
$$
C(n,k) = \begin{cases}
1 & \text{if } k = 0 \\
0 & \text{if } n < k \\
C(n-1,k-1) + C(n-1,k) & \text{otherwise}
\end{cases}
$$
I’m trying to write a recursive function C that computes C(n,k) using this recursive formula. As far as I can tell, the code I have written should work, but it doesn’t give me the correct answers.
This is my code:
def combinations(n,k):
    # base case
    if k ==0:
        return 1
    elif n<k:
        return 0
    # recursive case
    else:
        return combinations(n-1,k-1)+ combinations(n-1,k)
The answers should look like this:
>>> c(2, 1)
0
>>> c(1, 2)
2
>>> c(2, 5)
10
but I get other numbers... I don’t see where the problem is in my code.
I would try reversing the arguments, because as written n < k.
I think you mean this:
>>> c(2, 1)
2
>>> c(5, 2)
10
Your calls, e.g. c(2, 5) means that n=2 and k=5 (as per your definition of c at the top of your question). So n < k and as such the result should be 0. And that’s exactly what happens with your implementation. And all other examples do yield the actually correct results as well.
Are you sure that the arguments of your example test cases have the correct order? Because they are all c(k, n)-calls. So either those calls are wrong, or the order in your definition of c is off.
This is one of those times where you really shouldn't be using a recursive function. Computing combinations is very simple to do directly. For some things, like a factorial function, using recursion is no big deal, because there is only one recursive call per step, so the work stays linear (languages with tail-call optimization can even turn it into a loop, although CPython does not do this).
Here's the reason why:
Why do we never use this definition for the Fibonacci sequence when we are writing a program?
def fibbonacci(idx):
    if(idx < 2):
        return idx
    else:
        return fibbonacci(idx-1) + fibbonacci(idx-2)
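The reason is that, because of the double recursion, it is prohibitively slow: the same subproblems are recomputed over and over, so the running time grows exponentially with idx. Multiple separate recursive calls should be avoided if possible, for the same reason.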
If you do insist on using recursion, I would recommend reading this page first. A better recursive implementation will require only one recursive call each time. Rosetta code seems to have some pretty good recursive implementations as well.
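For comparison, computing the value directly can be sketched with the multiplicative formula, which needs only a single loop. The name combinations_direct is hypothetical, and returning 0 when n < k follows the definition given in the question.

```python
def combinations_direct(n, k):
    """C(n, k) via the multiplicative formula, with n < k giving 0."""
    if k < 0 or n < k:
        return 0
    k = min(k, n - k)  # C(n, k) == C(n, n - k); use the smaller one
    result = 1
    for j in range(1, k + 1):
        # After this step, result == C(n - k + j, j), so the division is exact.
        result = result * (n - k + j) // j
    return result

print(combinations_direct(5, 2))  # prints 10
print(combinations_direct(2, 1))  # prints 2
print(combinations_direct(2, 5))  # prints 0
```

On Python 3.8 and later, math.comb(n, k) from the standard library computes the same value.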

Logical precedence in Python

I have a question about python precedence. I have the following code:
def gcdIter(a, b):
    ans = min(a,b)
    while ((a%ans is not 0) and (b%ans is not 0)):
        ans -= 1
    return ans
My question is about the while condition. I added several parentheses just to make sure that the expression would be evaluated the way I was thinking, but it is not. The while loop exits before both expressions are true. Where am I wrong?
I found a way to do the same thing without using two expressions, in:
def gcdIter(a, b):
    ans = min(a,b)
    while ((a%ans + b%ans is not 0)) :
        ans -= 1
    return ans
But I still want to know why the first code isn't running the way I think it should.
Do not use identity testing (is or is not) to test for numerical equality. Use == or != instead.
while a%ans != 0 and b%ans != 0:
is tests for object identity (that both operands are the same Python object), which is not the same thing as testing whether the values are equal.
Since 0 is also considered False in a boolean context, you can even omit the != in this case:
while a % ans and b % ans:
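In Python 2, the fractions module already has a gcd() function that implements the greatest common divisor algorithm correctly (in Python 3.5 and later, use math.gcd instead; fractions.gcd was removed in Python 3.9):
from fractions import gcd
print gcd(a, b)
It uses the Euclidean algorithm, Python style: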
def gcd(a, b):
    """Calculate the Greatest Common Divisor of a and b.
    Unless b==0, the result will have the same sign as b (so that when
    b is divided by it, the result comes out positive).
    """
    while b:
        a, b = b, a%b
    return a
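One further point about the loop in the question, offered as a sketch under the assumption that the goal is to stop only once ans divides both numbers: replacing is not with != fixes the comparison, but a condition joined with and still ends the loop as soon as ans divides either number. By De Morgan's law, the loop has to continue while at least one remainder is non-zero, which means or. The name gcd_iter is hypothetical, the inputs are assumed to be positive integers, and math.gcd (Python 3.5+) is used only as a cross-check.

```python
import math

def gcd_iter(a, b):
    ans = min(a, b)
    # Keep going while ans fails to divide at least one of the two numbers.
    while a % ans != 0 or b % ans != 0:
        ans -= 1
    return ans

print(gcd_iter(4, 6), math.gcd(4, 6))  # prints 2 2
```

With and in place of or, the same call would stop immediately at ans = 4, because 4 divides a even though it does not divide b.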
