Binary search on array with duplicate


First time posting here, so apologies in advance if I am not following best practices. My algorithm is supposed to do the following on a sorted array with possible duplicates:
Return -1 if the element does not exist in the array.
Otherwise, return the smallest index at which the element is present.
I have written a binary search for an array without duplicates. It returns a position of the element, or -1. Based on black-box testing, I know that the non-duplicate version of the binary search works. I then call that function from another, recursive function to search from 0 to position-1 and find the first occurrence of the element, if any.
I am currently failing a black-box test. I am getting a wrong answer error, not a timeout error. I have tried most of the corner cases I could think of, and I also ran a brute-force comparison against the naive search algorithm, but I could not find an issue.
I am looking for guidance on what might be wrong in my implementation rather than an alternative solution.
The format is as follows:
Input:
5 #array size
3 4 7 7 8 #array elements need to be sorted
5 #search query array size
3 7 2 8 4 #query elements
Output
0 2 -1 4 1
My code is shown below:
import numpy as np  # needed for np.floor below

class BinarySearch:
    def __init__(self,input_list,query):
        self.array=input_list
        self.length=len(input_list)
        self.query=query
        return
    def binary_search(self,low,high):
        '''
        Implementing the binary search algorithm with distinct numbers on a
        sorted input.
        '''
        #trivial case
        if (self.query<self.array[low]) or (self.query>self.array[high-1]):
            return -1
        elif (low>=high-1) and self.array[low]!=self.query:
            return -1
        else:
            m=low+int(np.floor((high-low)/2))
            if self.array[low]==self.query:
                return low
            elif (self.array[m-1]>=self.query):
                return self.binary_search(low,m)
            elif self.array[high-1]==self.query:
                return high-1
            else:
                return self.binary_search(m,high)
        return
class DuplicateBinarySearch(BinarySearch):
    def __init__(self,input_list,query):
        BinarySearch.__init__(self,input_list,query)
    def handle_duplicate(self,position):
        '''
        Function handles the duplicate number problem.
        Input: position where query is identified.
        Output: updated earlier position if it exists else return
        original position.
        '''
        if position==-1:
            return -1
        elif position==0:
            return 0
        elif self.array[position-1]!=self.query:
            return position
        else:
            new_position=self.binary_search(0,position)
            if new_position==-1 or new_position>=position:
                return position
            else:
                return self.handle_duplicate(new_position)
    def naive_duplicate(self,position):
        old_position=position
        if position==-1:
            return -1
        else:
            while position>=0 and self.array[position]==self.query:
                position-=1
            if position==-1:
                return old_position
            else:
                return position+1
if __name__ == '__main__':
    num_keys = int(input())
    input_keys = list(map(int, input().split()))
    assert len(input_keys) == num_keys
    num_queries = int(input())
    input_queries = list(map(int, input().split()))
    assert len(input_queries) == num_queries
    for q in input_queries:
        item=DuplicateBinarySearch(input_keys,q)
        #res=item.handle_duplicate(item.binary_search(0,item.length))
        #res=item.naive_duplicate(item.binary_search(0,item.length))
        #assert res_check==res
        print(item.handle_duplicate(item.binary_search(0,item.length)), end=' ')
        #print(item.naive_duplicate(item.binary_search(0,item.length)), end=' ')
When I run a naive duplicate algorithm, I get a time out error:
Failed case #56/57: time limit exceeded (Time used: 10.00/5.00, memory used: 42201088/536870912.)
When I run the binary search with duplicate algorithm, I get a wrong answer error on a different test case:
Failed case #24/57: Wrong answer
(Time used: 0.11/5.00, memory used: 42106880/536870912.)
The problem statement is as follows:
Problem Statement
Update:
I could make the code work with the following change, but I have not been able to construct a test case that shows why the original code fails.
The original binary search function works when there are no duplicates, but it fails an unknown edge case when handle_duplicate calls it repeatedly. I changed the binary search function to the following:
    def binary_search(self,low,high):
        '''
        Implementing the binary search algorithm with distinct numbers on a sorted input.
        '''
        #trivial case
        if (low>=high-1) and self.array[low]!=self.query:
            return -1
        elif (self.query<self.array[low]) or (self.query>self.array[high-1]):
            return -1
        else:
            m=low+(high-low)//2
            if self.array[low]==self.query:
                return low
            elif (self.array[m-1]>=self.query):
                return self.binary_search(low,m)
            elif self.array[m]<=self.query:
                return self.binary_search(m,high)
            elif self.array[high-1]==self.query:
                return high-1
            else:
                return -1

Since you are implementing binary search recursively, I would suggest adding a parameter 'result' that acts as the return value and holds the most recent index found to equal the target value.
Here is an example:
def binarySearchRecursive(nums, left, right, target, result):
    """
    This is your exit point.
    If the target is not found, result will be -1 since it won't change from initial value.
    If the target is found, result will be the index of the first occurrence of the target.
    """
    if left > right:
        return result
    # Overflow prevention
    mid = left + (right - left) // 2
    if nums[mid] == target:
        # We are not sure if this is the first occurrence of the target.
        # So we will store the index to the result now, and keep checking.
        result = mid
        # Since we are looking for "first occurrence", we discard right half.
        return binarySearchRecursive(nums, left, mid - 1, target, result)
    elif target < nums[mid]:
        return binarySearchRecursive(nums, left, mid - 1, target, result)
    else:
        return binarySearchRecursive(nums, mid + 1, right, target, result)
if __name__ == '__main__':
    nums = [2,4,4,4,7,7,9]
    target = 4
    (left, right) = (0, len(nums)-1)
    result = -1 # Initial value
    index = binarySearchRecursive(nums, left, right, target, result)
    if index != -1:
        print(index)
    else:
        print('Not found')
From your updated version, I still feel the exit point of your function (your "trivial case" section) is a little unintuitive.
The only condition under which the search should stop is that you have searched every possible section of the list. That is when the search range is empty and there is no element left to check. In the implementation, that is when left > right (or high < low) is true.
The 'result' variable is initialized to -1 when the function is first called from main, and it does not change if no match is found. After each successful match, since we cannot be sure it is the first occurrence, we just store that index in result. If there are more matches further left, the value is updated; if not, the stored value is eventually returned. If the target is not in the list, the return value is -1, its initial value.
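For cross-checking a hand-written search, the standard library's bisect.bisect_left gives the same "first occurrence" answer. The sketch below is not the recursive approach described above; the helper name first_occurrence is made up for illustration.

```python
from bisect import bisect_left

def first_occurrence(nums, target):
    """Return the smallest index of target in sorted nums, or -1 if absent.

    first_occurrence is a hypothetical helper name, not from the question.
    """
    # bisect_left returns the index of the first element >= target
    # (or len(nums) if every element is smaller).
    i = bisect_left(nums, target)
    if i < len(nums) and nums[i] == target:
        return i
    return -1

if __name__ == '__main__':
    keys = [3, 4, 7, 7, 8]
    # Same sample as in the question; prints: 0 2 -1 4 1
    print(*(first_occurrence(keys, q) for q in [3, 7, 2, 8, 4]))
```

Comparing its output with your own function on random sorted lists is one way to hunt for the failing edge case.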

Related

How to make this greedy function faster?

I am trying to solve a problem, and my code fails one test case where the list has length 25000. Is there any way I can make this faster? I tried using functools.lru_cache and it still does not finish within the required time.
This is the problem from the site
Given an array of non-negative integers nums, you are initially
positioned at the first index of the array.
Each element in the array represents your maximum jump length at that
position.
Determine if you are able to reach the last index.
This is what I have tried
def can_jump(input_list):
    #lru_cache
    def helper(idx = 0):
        if idx == len(input_list) - 1:
            return True
        return any(helper(new_idx) for x in range(input_list[idx]) \
                   if (new_idx := idx + x + 1) < len(input_list)) # increasing order of jumps
    return helper()
Sample test cases work
input_list = [2,3,1,1,4]
print(can_jump(input_list)) # True
input_list = [3,2,1,0,4]
print(can_jump(input_list)) # False
I have also tried going from the other direction,
return any(helper(new_idx) for x in range(input_list[idx], 0, -1) \
if (new_idx := idx + x) < len(input_list)) # decreasing order of jumps
But I still can not make this run fast enough to clear the last test case of 25000 element list, what is it that I am doing wrong here?

Ok, I think I get it. Can you try this? Please note, this is taken straight from: https://codereview.stackexchange.com/questions/223612/jump-game-leetcode
def canjump(nums):
    maximum_reachable_index = 0
    for index, max_jump in enumerate(nums):
        if index > maximum_reachable_index:
            return False
        maximum_reachable_index = max(index + max_jump, maximum_reachable_index)
    return True
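Another way to look at the same greedy idea is to scan from the right and track the leftmost index that is known to reach the end. This is a sketch of that variant, not the code from the linked review; the name can_jump_backward is made up, and it assumes a non-empty list.

```python
def can_jump_backward(nums):
    """Greedy right-to-left scan; assumes nums is non-empty.

    can_jump_backward is a hypothetical name used for illustration.
    """
    goal = len(nums) - 1          # leftmost index known to reach the end
    for index in range(len(nums) - 2, -1, -1):
        # If we can jump from index to the current goal (or beyond),
        # then index itself becomes the new goal.
        if index + nums[index] >= goal:
            goal = index
    return goal == 0

if __name__ == '__main__':
    print(can_jump_backward([2, 3, 1, 1, 4]))  # True
    print(can_jump_backward([3, 2, 1, 0, 4]))  # False
```

Like the forward version, this makes a single pass over the list, so a 25000-element input takes 25000 steps rather than exploring every jump combination.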

recursive binary search of a sorted sublist

I am trying to implement a recursive binary search algorithm that takes 4 arguments: a list; first, the integer index of the first item in the sorted sub-sequence; last, the integer index of the last item in the sorted sub-sequence; and a target, which will be compared to the values stored in the list.
The algorithm needs to return the position of the target within the sorted sub-sequence if it exists, and otherwise the position at which it should be inserted into the sorted sub-sequence.
Here's what I have thus far:
def binary_search(a_list, first, last, target):
    subMidpoint = (first + last) // 2
    if a_list[subMidpoint] == target:
        return subMidpoint
    else:
        if target < a_list[subMidpoint]:
            last = subMidpoint -1
            return binarySearch(a_list, first, last, target)
        else:
            first = subMidpoint +1
            return binarySearch(a_list, first, last, target)
    return first
I am struggling to wrap my head around how it will return the position if the item does not exist; any help would be greatly appreciated. The code currently compiles, but it returns 'None' rather than an index position.
Many thanks in advance.
Edit:
Thanks all for your help. I have managed to alter the final clause and it now passes some tests; however, it fails when the target is less than the value at first and when the target is greater than the value at last.
Here's the altered final clause.
    else:
        if target < a_list[subMidpoint]:
            last = subMidpoint -1
            return binary_search(a_list, first, last, target)
        else:
            first = subMidpoint +1
            return first

You almost have your answer in your description: if you get down to adjacent items, say positions 5 and 6, and you haven't found the item, then it would be inserted between those two. Since list indices grow to the upper end, you'd return the higher of the two -- 6, in this case.
Thus, your logic would be in your last clause:
        else:
            if subMidpoint == first:
                return last
            else:
                first = subMidpoint +1
                return binarySearch(a_list, first, last, target)
Drop that return first at the bottom; you should not be able to reach that statement.
Learn the elif keyword; your program will be more readable.
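As an alternative to checking for adjacent items, the recursion can simply stop when the range becomes empty (first > last); at that point first is the insertion position. This is a sketch under that assumption, not the exact logic given above, and the name insertion_index is made up.

```python
def insertion_index(a_list, first, last, target):
    """Index of target in a_list[first..last], or where it would be inserted.

    insertion_index is a hypothetical name; first and last are inclusive.
    """
    if first > last:
        # Empty range: everything to the left of `first` is smaller than
        # target and everything from `first` on is larger.
        return first
    mid = (first + last) // 2
    if a_list[mid] == target:
        return mid
    elif target < a_list[mid]:
        return insertion_index(a_list, first, mid - 1, target)
    else:
        return insertion_index(a_list, mid + 1, last, target)

if __name__ == '__main__':
    data = [10, 20, 30, 40]
    print(insertion_index(data, 0, 3, 25))  # 2
    print(insertion_index(data, 0, 3, 5))   # 0
    print(insertion_index(data, 0, 3, 45))  # 4
```

With this base case, targets below the smallest value or above the largest value need no special handling.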

Solved, thanks everyone. Not the cleanest solution but it works.
def binary_search(a_list, first, last, target):
    subMidpoint = (first + last) // 2
    if target < a_list[first]:
        return first
    elif target > a_list[last]:
        return last +1
    elif a_list[subMidpoint] == target:
        return subMidpoint
    elif target < a_list[subMidpoint]:
        last = subMidpoint -1
        return binary_search(a_list, first, last, target)
    else:
        first = subMidpoint +1
        return first

Binary search: weird middle point calculation

Regarding the calculation of the list mid-point: why is it computed as
i = (first +last) //2
when last is initialized to len(a_list) - 1? From my quick tests, the algorithm works correctly without the -1.
def binary_search(a_list, item):
    """Performs iterative binary search to find the position of an integer in a given, sorted, list.
    a_list -- sorted list of integers
    item -- integer you are searching for the position of
    """
    first = 0
    last = len(a_list) - 1
    while first <= last:
        i = (first + last) / 2
        if a_list[i] == item:
            return '{item} found at position {i}'.format(item=item, i=i)
        elif a_list[i] > item:
            last = i - 1
        elif a_list[i] < item:
            first = i + 1
        else:
            return '{item} not found in the list'.format(item=item)

The last legal index is len(a_list) - 1. The algorithm will work correctly, as first will always be no more than this, so that the truncated mean will never go out of bounds. However, without the -1, the midpoint computation will be one larger than optimum about half the time, resulting in a slight loss of speed.

Consider the case where the item you're searching for is greater than all the elements of the list. In that case the statement first = i + 1 gets executed repeatedly. Finally you get to the last iteration of the loop, where first == last. In that case i is also equal to last, but if last=len() then i is off the end of the list! The first if statement will fail with an index out of range.
See for yourself: https://ideone.com/yvdTzo
You have another error in that code too, but I'll let you find it for yourself.
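To make the out-of-range case concrete, here is a small sketch that takes the initial value of last as a parameter so both choices can be tried. The function name search and its third parameter are made up for this illustration, and it returns an index or -1 instead of a message.

```python
def search(a_list, item, last):
    """Iterative binary search; `last` is the initial upper index to try.

    Hypothetical helper for illustration only.
    """
    first = 0
    while first <= last:
        i = (first + last) // 2
        if a_list[i] == item:
            return i
        elif a_list[i] > item:
            last = i - 1
        else:
            first = i + 1
    return -1

if __name__ == '__main__':
    data = [1, 3, 5, 7]
    # With last = len(data) - 1 the search ends cleanly.
    print(search(data, 9, len(data) - 1))  # -1
    # With last = len(data), the final probe is data[4], which is out of range.
    try:
        search(data, 9, len(data))
    except IndexError as exc:
        print('IndexError:', exc)
```

Searching for an item that is present usually still works with last = len(data), which is why quick tests can miss the problem.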

Find index of list using binary recursive function

I've been instructed to create a function with 2 parameters, a list and a number, that uses a recursive binary search to see if the number is in the list. If the number is in the list, I'm to return its index, and if it's not, I am to return -1. So far I have:
def findIndex(alist,num):
    print(alist)
    if len(alist) % 2 == 0:
        mid = int((len(alist)/2)-1)
    else:
        mid = ((len(alist)//2))
    if alist[mid] == num:
        print(mid)
    elif alist[mid] > num:
        findIndex(alist[0:mid],num)
    elif alist[mid] < num:
        findIndex(alist[mid+1:],num)
I know how a binary search works. Go to the middle; if it's not the number you're searching for, compare it to the number you're searching for. If it's greater than the number you're searching for, search the front half of the list. If it's less, search the back half of the list. The problem is that my code only works when the number I'm searching for is less than the middle number at every step.

ANALYSIS
There are several problems with the logic.
The deleted post nailed your most glaring problem: your search works only when the search target appears in the middle of a series of left-only divisions. Otherwise, you print 0, the index when the list gets down to a single item.
If the target is not in the list, your program crashes on index out of range, when you try to find the midpoint of an empty list.
You never return anything. Printing a result is not the same as returning a value.
SOLUTION
There are two straightforward ways to handle this. The first is to use findIndex as a wrapper function, and write the function you want to be called by that. For instance:
def findIndex(alist,num):
    return binaryFind(alist, 0, len(alist), num)
def binaryFind(alist, left, right, target):
    # Here, you write a typical binary search function
    # with left & right limits.
The second is to return the index you find, but adjust it for all of the times you cut off the left half of the list. Each level of call has to add that adjustment to the return value, passing the sum back to the previous level. The simple case looks like this, where you're recurring on the right half of the list:
    elif alist[mid] < num:
        return (mid+1) + findIndex(alist[mid+1:], num)
Does that get you moving toward a useful solution?
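One detail the slicing approach needs: if the recursive call on the right half returns -1, the offset must not be added to it, or "not found" turns into a bogus index. Here is a minimal sketch of the second approach with that check; the name find_index and the simplified midpoint are my own choices, not from the question.

```python
def find_index(alist, num):
    """Recursive binary search on slices.

    Returns the index of num in the list passed in, or -1 if absent.
    find_index is a hypothetical name used for illustration.
    """
    if not alist:
        # Empty slice: nothing left to search.
        return -1
    mid = len(alist) // 2
    if alist[mid] == num:
        return mid
    elif alist[mid] > num:
        # Left half keeps the same indices, so no adjustment is needed.
        return find_index(alist[:mid], num)
    else:
        # Indices in the right-hand slice are shifted by mid + 1,
        # but a "not found" result of -1 must be passed back unchanged.
        sub = find_index(alist[mid + 1:], num)
        return -1 if sub == -1 else (mid + 1) + sub

if __name__ == '__main__':
    data = [1, 3, 5, 7, 9]
    print(find_index(data, 9))  # 4
    print(find_index(data, 4))  # -1
```

The empty-list check also covers the crash described in the analysis above.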

Python code for Binary Search Algorithm does not compile [closed]

I'm trying to implement a Binary Search algorithm in Python.
I wrote this code on my phone but it didn't compile.
And I don't know why it doesn't compile.
(I haven't tried it on a computer.)
def BinarySearch(sList, search):
    res=0
    sortedList=sorted(sList)
    x=int(len(sortedList)/2)
    mid=sortedList[x]
    def DivideSearch(sortedList ,search ):
        first=sortedList[:mid ]
        second=sortedList[mid:]
        if first[len(first)-1]<search:
            DivideSearch (first, search)
        elif second[len(second)-1]>search:
            DivideSearch (second, search)
        elif len(first)==1:
            res=first.pop()
        elif len(second)==1:
            res=second.pop()
    if res==search:
        return res
numbers=[1,2,3,4,5,6,7,8,9]
guess=3
print(BinarySearch(numbers,guess ))
What keeps this code from compiling?
What are my mistakes and how can I fix them?

First, your code is running fine on my machine. Second, your logic is flawed. res never gets assigned in the BinarySearch() function because it is in a different scope than in the parent function. Also, your base case check should not be done on first or second it should be done on sortedList at the beginning of the function. Also, you can do your checking if the value was found in the DivideSearch() function. I'm uploading corrected code, take a look at this
import random
def DivideSearch(sortedList, search, mid):
    first = sortedList[:mid]
    second = sortedList[mid:]
    #check for our base case
    if len(sortedList) ==1:
        return sortedList[0] if sortedList[0] == search else None
    #we can immediately remove half the cases if they're less than or greater than our search value
    #if the greatest value element in the lower half is < search value, only search the higher value list
    if first[-1] < search:
        #recurse (integer division keeps mid usable as a slice index)
        return DivideSearch(second, search, len(second)//2)
    #otherwise we want to check the lower value list
    else:
        return DivideSearch(first, search, len(first)//2)
def BinarySearch(sList, search):
    sortedList=sorted(sList)
    #determine mid cleanup
    mid=len(sortedList)//2
    #return the result
    return DivideSearch(sortedList,search, mid)
numbers=[random.randint(1, 10) for x in range(1,10)]
guess=5
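The scoping point is easy to check in isolation. In this sketch (function names made up for illustration), assigning to res inside a nested function creates a new local variable unless it is declared nonlocal:

```python
def outer_without_nonlocal():
    res = 0
    def inner():
        # This assignment creates a new local `res` inside inner();
        # the `res` in the enclosing function is untouched.
        res = 42
    inner()
    return res

def outer_with_nonlocal():
    res = 0
    def inner():
        # nonlocal makes the assignment rebind the enclosing `res`.
        nonlocal res
        res = 42
    inner()
    return res

if __name__ == '__main__':
    print(outer_without_nonlocal())  # 0
    print(outer_with_nonlocal())     # 42
```

Returning the value from the inner function, as the corrected code above does, avoids the issue without needing nonlocal.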

def binarys(list, item):
    #you have to keep track of the head(lo) of the list and tail(hi)
    lo = 0
    #a list must be sorted for binary search to work because the lower values are on the left and higher on the right
    slist = sorted(list)
    hi = len(slist) - 1
    #Keep running the search as long as the start of the range is less than or equal to the end. Once lo passes hi there is nothing left to check, so the item isn't there at all and we return False
    while lo <= hi:
        mid = (lo + hi)//2
        #if the item you searched for is in the middle, return True
        if slist[mid] == item:
            return True
        #if the item you're looking for is less than the value of the mid item, you can ignore the entire right part of the list by making the item to the left of the midpoint the new tail(hi). midpoint minus 1 because you already established the midpoint didn't have the item you searched for.
        elif item < slist[mid]:
            hi = mid - 1
        #if the item you're looking for is greater than the value of the mid item, you can ignore the entire left part of the list by making the item to the right of the midpoint the new head(lo). midpoint plus 1 because you already established the midpoint didn't have the item you searched for.
        else:
            if item > slist[mid]:
                lo = mid+ 1
    return False
print(binarys([1,2,3,4,5,6,7,8,9,10], 1))
