Divide and Conquer. Find the majority of element in array - python

I am working on a python algorithm to find the most frequent element in the list.
def GetFrequency(a, element):
return sum([1 for x in a if x == element])
def GetMajorityElement(a):
n = len(a)
if n == 1:
return a[0]
k = n // 2
elemlsub = GetMajorityElement(a[:k])
elemrsub = GetMajorityElement(a[k:])
if elemlsub == elemrsub:
return elemlsub
lcount = GetFrequency(a, elemlsub)
rcount = GetFrequency(a, elemrsub)
if lcount > k:
return elemlsub
elif rcount > k:
return elemrsub
else:
return None
I tried some test cases. Some of them are passed, but some of them fails.
For example, [1,2,1,3,4] this should return 1, buit I get None.
The implementation follows the pseudocode here:
http://users.eecs.northwestern.edu/~dda902/336/hw4-sol.pdf
The pseudocode finds the majority item and needs to be at least half. I only want to find the majority item.
Can I get some help?
Thanks!

I wrote an iterative version instead of the recursive one you're using in case you wanted something similar.
def GetFrequency(array):
majority = int(len(array)/2)
result_dict = {}
while array:
array_item = array.pop()
if result_dict.get(array_item):
result_dict[array_item] += 1
else:
result_dict[array_item] = 1
if result_dict[array_item] > majority:
return array_item
return max(result_dict, key=result_dict.get)
This will iterate through the array and return the value as soon as one hits more than 50% of the total (being a majority). Otherwise it goes through the entire array and returns the value with the greatest frequency.

def majority_element(a):
return max([(a.count(elem), elem) for elem in set(a)])[1]
EDIT
If there is a tie, the biggest value is returned. E.g: a = [1,1,2,2] returns 2. Might not be what you want but that could be changed.
EDIT 2
The pseudocode you gave divided into arrays 1 to k included, k + 1 to n. Your code does 1 to k - 1, k to end, not sure if it changes much though ? If you want to respect the algorithm you gave, you should do:
elemlsub = GetMajorityElement(a[:k+1]) # this slice is indices 0 to k
elemrsub = GetMajorityElement(a[k+1:]) # this one is k + 1 to n.
Also still according to your provided pseudocode, lcount and rcount should be compared to k + 1, not k:
if lcount > k + 1:
return elemlsub
elif rcount > k + 1:
return elemrsub
else:
return None
EDIT 3
Some people in the comments highligted that provided pseudocode solves not for the most frequent, but for the item which is present more that 50% of occurences. So indeed your output for your example is correct. There is a good chance that your code already works as is.
EDIT 4
If you want to return None when there is a tie, I suggest this:
def majority_element(a):
n = len(a)
if n == 1:
return a[0]
if n == 0:
return None
sorted_counts = sorted([(a.count(elem), elem) for elem in set(a)], key=lambda x: x[0])
if len(sorted_counts) > 1 and sorted_counts[-1][0] == sorted_counts[-2][0]:
return None
return sorted_counts[-1][1]

Related

Code for consecutive strings works but can't pass random tests

In this problem, I'm given an array(list) strarr of strings and an integer k. My task is to return the first longest string consisting of k consecutive strings taken in the array. My code passed all the sample tests from CodeWars but can't seem to pass the random tests.
Here's the link to the problem.
I did it in two days. I found the max consecutively combined string first. Here's the code for that.
strarr = []
def longest_consec(strarr, k):
strarr.append('')
length = len(strarr)
cons_list = []
end = k
start = 0
freq = -length/2
final_string = []
largest = max(strarr, key=len, default='')
if k == 1:
return largest
elif 1 < k < length:
while(freq <= 1):
cons_list.append(strarr[start:end])
start += k-1
end += k-1
freq += 1
for index in cons_list:
final_string.append(''.join(index))
return max(final_string, key=len, default='')
else:
return ""
Since that didn't pass all the random tests, I compared the combined k strings on both sides of the single largest string. But, this way, the code doesn't account for the case when the single largest string is in the middle. Please help.
strarr = []
def longest_consec(strarr, k):
strarr.append('')
length = len(strarr)
largest = max(strarr, key=len, default='')
pos = int(strarr.index(largest))
if k == 1:
return largest
elif 1 < k < length:
prev_string = ''.join(strarr[pos+1-k:pos+1])
next_string = ''.join(strarr[pos:pos+k])
if len(prev_string) >= len(next_string):
res = prev_string
else:
res = next_string
return res
else:
return ""
print(longest_consec(["zone", "abigail", "theta", "form", "libe"], 2))
Let's start from the first statement of your function:
if k == 1:
while(p <= 1):
b.append(strarr[j:i])
j += 1
i += 1
p += 1
for w in b:
q.append(''.join(w))
return max(q, key=len)
Here q is finally equal strarr so you can shorten this code to:
if k == 1:
return max(strarr, key=len)
I see that second statement's condition checks if k value is between 1 and length of string array inclusive:
elif k > 1 and k <= 2*a:
...
If you want no errors remove equality symbol, last element of every array has index lesser than its length (equal exactly length of it minus 1).
Ceiling and division is not necessary in a definition, so you can shorten this:
a = ceil(len(strarr)/2)
into this:
a = len(strarr)
then your elif statement may look like below:
elif 1 < k < a: # Same as (k > 1 and k < a)
...
again, I see you want to concatenate (add) the longest string to k next strings using this code:
while(p <= 1):
b.append(strarr[j:i])
j += k-1
i += k-1
p += 1
for w in b:
q.append(''.join(w))
return max(q, key=len)
the more clearer way of doing this:
longest = max(strarr, key=len) # Longest string in array.
index = 0 # Index of the current item.
for string in strarr:
# If current string is equal the longest one ...
if string == longest:
# Join 'k' strings from current index (longest string index).
return ''.join(strarr[index:index + k])
index += 1 # Increase current index.
And the last statement which is:
elif k > 2*a or k<1:
return ""
if all previous statements failed then value is invalid so you can instead write:
return "" # Same as with else.
Now everything should work. I advice you learning the basics (especially lists, strings and slices), and please name your variables wisely so they are more readable.
You can try this as well
this has passed all the test cases on the platform you suggested.
def longest_consec(strarr, k):
i = 0
max_ = ""
res = ""
if (k<=0) or (k>len(strarr)):
return ""
while i<=(len(strarr)-k):
start = "".join(strarr[i:i+k])
max_ = max(max_, start, key=len)
if max_==start:
res=strarr[i:i+k]
i+=1
return max_
#output: ["zone", "abigail", "theta", "form", "libe", "zas", "theta", "abigail"], 2 -> abigailtheta
#output: ["zones", "abigail", "theta", "form", "libe", "zas", "theta", "abigail"],2 -> zonesabigail

Count the number of sub arrays such that each element in the subarray appears at least twice

I have come across a problem and can't seem to come up with an efficient solution. The problem is as follows:
Given an array A, count the number of consecutive contiguous subarrays such that each element in the subarray appears at least twice.
Ex, for:
A = [0,0,0]
The answer should be 3 because we have
A[0..1] = [0,0]
A[1..2] = [0,0]
A[0..3] = [0,0,0]
Another example:
A=[1,2,1,2,3]
The answer should be 1 because we have:
A[0..3] = [1,2,1,2]
I can't seem to come up with an efficient solution for this algorithm. I have an algorithm that checks every possible subarray (O(n^2)), but I need something better. This is my naive solution:
def duplicatesOnSegment(arr):
total = 0
for i in range(0,len(arr)):
unique = 0
test = {}
for j in range(i,len(arr)):
k = arr[j]
if k not in test:
test[k] = 1
else:
test[k] += 1
if test[k] == 1:
unique += 1
elif test[k] == 2:
unique -= 1
if unique == 0:
total += 1
return total
Your program in the worst case is greater than O(n^2) as you are using if k not in test in nested loop. This in in worst case is O(n) resulting in overall worst case O(n^3). I have this, O(n^2) in worst, solution which uses collections.defaultdict as hash to make this faster.
from collections import defaultdict
def func(A):
result = 0
for i in range(0,len(A)):
counter = 0
hash = defaultdict (int)
for j in range (i, len(A)):
hash[A[j]] += 1
if hash[A[j]] == 2:
counter += 1
if counter != 0 and counter == len(hash):
result += 1
return result
To start with, I would consider the negation of the property of interest :
not(all(x in l appears at least twice)) = exists(x in l such that any other y in l != x)
Not sure yet that the previous reformulation of your question might help, but my intuition as a mathematician told me to try this way... Still feels like O(n^2) though...
my_list = ['b', 'b', 'a', 'd', 'd', 'c']
def remaining_list(el,list):
assert el in list
if el == list[-1]: return []
else: return list[list.index(el)+1:]
def has_duplicate(el, list):
assert el in list
return el in remaining_list(el,list)
list(filter(lambda e:not(has_duplicate(e,my_list)),my_list))

Returning smallest positive int that does not occur in given list

Write a function that given an array of A of N int, returns the smallest positive(greater than 0) that does not occur in A.
I decided to approach this problem by iterating through the list after sorting it.
The value of the current element would be compared to the value of the next element. Because the list is sorted, the list should follow sequentially until the end.
However, if there is a skipped number this indicates the smallest number that does not occur in the list.
And if it follows through until the end, then you should just add one to the value of the last element.
def test():
arr = [23,26,25,24,28]
arr.sort()
l = len(arr)
if arr[-1] <= 0:
return 1
for i in range(0,l):
for j in range(1,l):
cur_val = arr[i]
next_val = arr[j]
num = cur_val + 1
if num != next_val:
return num
if num == next_val: //if completes the list with no skips
return arr[j] + 1
print(test())
I suggest that you convert to a set, and you can then efficiently test whether numbers are members of it:
def first_int_not_in_list(lst, starting_value=1):
s = set(lst)
i = starting_value
while i in s:
i += 1
return i
arr = [23,26,25,24,28]
print(first_int_not_in_list(arr)) # prints 1
You can do the following:
def minint(arr):
s=set(range(min(arr),max(arr)))-set(arr)
if len(s)>0:
return min(set(range(min(arr),max(arr)))-set(arr)) #the common case
elif 1 in arr:
return max(arr)+1 #arr is a complete range with no blanks
else:
return 1 #arr is negative numbers only
You can make use of sets to achieve your goal.
set.difference() method is same as relative complement denoted by A – B, is the set of all elements in A that are not in B.
Example:
Let A = {1, 3, 5} and B = {1, 2, 3, 4, 5, 6}. Then A - B = {2, 4, 6}.
Using isNeg() method is used to check whether given set contains any negative integer.
Using min() method on A - B returns the minimum value from set difference.
Here's the code snippet
def retMin(arrList):
min_val = min(arrList) if isNeg(arrList) else 1
seqList=list(range((min_val),abs(max(arrList))+2))
return min(list(set(seqList).difference(arrList)))
def isNeg(arr):
return(all (x > 0 for x in arr))
Input:
print(retMin([1,3,6,4,1,2]))
Output:
5
Input:
print(retMin([-2,-6,-7]))
Output:
1
Input:
print(retMin([23,25,26,28,30]))
Output:
24
Try with the following code and you should be able to solve your problem:
def test():
arr = [3,-1,23,26,25,24,28]
min_val = min(val for val in arr if val > 0)
arr.sort()
l = len(arr)
if arr[-1] <= 0:
return 1
for i in range(0,l):
if arr[i] > 0 and arr[i] <= min_val:
min_val = arr[i] + 1
return min_val
print(test())
EDIT
It seems you're searching for the the value grater than the minimum positive integer in tha array not sequentially.
The code it's just the same as before I only change min_val = 1 to:
min_val = min(val for val in arr if val > 0), so I'm using a lambda expression to get all the positive value of the array and after getting them, using the min function, I'll get the minimum of those.
You can test it here if you want

python3 binary search not working [duplicate]

I am trying to implement the binary search in python and have written it as follows. However, I can't make it stop whenever needle_element is larger than the largest element in the array.
Can you help? Thanks.
def binary_search(array, needle_element):
mid = (len(array)) / 2
if not len(array):
raise "Error"
if needle_element == array[mid]:
return mid
elif needle_element > array[mid]:
return mid + binary_search(array[mid:],needle_element)
elif needle_element < array[mid]:
return binary_search(array[:mid],needle_element)
else:
raise "Error"
It would be much better to work with a lower and upper indexes as Lasse V. Karlsen was suggesting in a comment to the question.
This would be the code:
def binary_search(array, target):
lower = 0
upper = len(array)
while lower < upper: # use < instead of <=
x = lower + (upper - lower) // 2
val = array[x]
if target == val:
return x
elif target > val:
if lower == x: # these two are the actual lines
break # you're looking for
lower = x
elif target < val:
upper = x
lower < upper will stop once you have reached the smaller number (from the left side)
if lower == x: break will stop once you've reached the higher number (from the right side)
Example:
>>> binary_search([1,5,8,10], 5) # return 1
1
>>> binary_search([1,5,8,10], 0) # return None
>>> binary_search([1,5,8,10], 15) # return None
Why not use the bisect module? It should do the job you need---less code for you to maintain and test.
array[mid:] creates a new sub-copy everytime you call it = slow. Also you use recursion, which in Python is slow, too.
Try this:
def binarysearch(sequence, value):
lo, hi = 0, len(sequence) - 1
while lo <= hi:
mid = (lo + hi) // 2
if sequence[mid] < value:
lo = mid + 1
elif value < sequence[mid]:
hi = mid - 1
else:
return mid
return None
In the case that needle_element > array[mid], you currently pass array[mid:] to the recursive call. But you know that array[mid] is too small, so you can pass array[mid+1:] instead (and adjust the returned index accordingly).
If the needle is larger than all the elements in the array, doing it this way will eventually give you an empty array, and an error will be raised as expected.
Note: Creating a sub-array each time will result in bad performance for large arrays. It's better to pass in the bounds of the array instead.
You can improve your algorithm as the others suggested, but let's first look at why it doesn't work:
You're getting stuck in a loop because if needle_element > array[mid], you're including element mid in the bisected array you search next. So if needle is not in the array, you'll eventually be searching an array of length one forever. Pass array[mid+1:] instead (it's legal even if mid+1 is not a valid index), and you'll eventually call your function with an array of length zero. So len(array) == 0 means "not found", not an error. Handle it appropriately.
This is a tail recursive solution, I think this is cleaner than copying partial arrays and then keeping track of the indexes for returning:
def binarySearch(elem, arr):
# return the index at which elem lies, or return false
# if elem is not found
# pre: array must be sorted
return binarySearchHelper(elem, arr, 0, len(arr) - 1)
def binarySearchHelper(elem, arr, start, end):
if start > end:
return False
mid = (start + end)//2
if arr[mid] == elem:
return mid
elif arr[mid] > elem:
# recurse to the left of mid
return binarySearchHelper(elem, arr, start, mid - 1)
else:
# recurse to the right of mid
return binarySearchHelper(elem, arr, mid + 1, end)
def binary_search(array, target):
low = 0
mid = len(array) / 2
upper = len(array)
if len(array) == 1:
if array[0] == target:
print target
return array[0]
else:
return False
if target == array[mid]:
print array[mid]
return mid
else:
if mid > low:
arrayl = array[0:mid]
binary_search(arrayl, target)
if upper > mid:
arrayu = array[mid:len(array)]
binary_search(arrayu, target)
if __name__ == "__main__":
a = [3,2,9,8,4,1,9,6,5,9,7]
binary_search(a,9)
Using Recursion:
def binarySearch(arr,item):
c = len(arr)//2
if item > arr[c]:
ans = binarySearch(arr[c+1:],item)
if ans:
return binarySearch(arr[c+1],item)+c+1
elif item < arr[c]:
return binarySearch(arr[:c],item)
else:
return c
binarySearch([1,5,8,10,20,50,60],10)
All the answers above are true , but I think it would help to share my code
def binary_search(number):
numbers_list = range(20, 100)
i = 0
j = len(numbers_list)
while i < j:
middle = int((i + j) / 2)
if number > numbers_list[middle]:
i = middle + 1
else:
j = middle
return 'the index is '+str(i)
If you're doing a binary search, I'm guessing the array is sorted. If that is true you should be able to compare the last element in the array to the needle_element. As octopus says, this can be done before the search begins.
You can just check to see that needle_element is in the bounds of the array before starting at all. This will make it more efficient also, since you won't have to do several steps to get to the end.
if needle_element < array[0] or needle_element > array[-1]:
# do something, raise error perhaps?
It returns the index of key in array by using recursive.
round() is a function convert float to integer and make code fast and goes to expected case[O(logn)].
A=[1,2,3,4,5,6,7,8,9,10]
low = 0
hi = len(A)
v=3
def BS(A,low,hi,v):
mid = round((hi+low)/2.0)
if v == mid:
print ("You have found dude!" + " " + "Index of v is ", A.index(v))
elif v < mid:
print ("Item is smaller than mid")
hi = mid-1
BS(A,low,hi,v)
else :
print ("Item is greater than mid")
low = mid + 1
BS(A,low,hi,v)
BS(A,low,hi,v)
Without the lower/upper indexes this should also do:
def exists_element(element, array):
if not array:
yield False
mid = len(array) // 2
if element == array[mid]:
yield True
elif element < array[mid]:
yield from exists_element(element, array[:mid])
else:
yield from exists_element(element, array[mid + 1:])
Returning a boolean if the value is in the list.
Capture the first and last index of the list, loop and divide the list capturing the mid value.
In each loop will do the same, then compare if value input is equal to mid value.
def binarySearch(array, value):
array = sorted(array)
first = 0
last = len(array) - 1
while first <= last:
midIndex = (first + last) // 2
midValue = array[midIndex]
if value == midValue:
return True
if value < midValue:
last = midIndex - 1
if value > midValue:
first = midIndex + 1
return False

Python Greedy Sum

I am currently working on this code and the only thing that seems to work is the "no solution." Also it seems that the code has an infinite loop and I can't seem to figure out how to solve it. If someone could point out my mistake it would be appreciated.
def greedySum(L, s):
""" input: s, positive integer, what the sum should add up to
L, list of unique positive integers sorted in descending order
Use the greedy approach where you find the largest multiplier for
the largest value in L then for the second largest, and so on to
solve the equation s = L[0]*m_0 + L[1]*m_1 + ... + L[n-1]*m_(n-1)
return: the sum of the multipliers or "no solution" if greedy approach does
not yield a set of multipliers such that the equation sums to 's'
"""
if len(L) == 0:
return "no solution"
sum_total = (0, ())
elif L[0] > k:
sum_total = greed(L[1:], k)
else:
no_number = L[0]
value_included, take = greed(L, k - L[0])
value_included += 1
no_value, no_take = greed(L[1:], k)
if k >= 0:
sum_total = (value_included, take + (no_number,))
else:
sum_total = (value_included, take + (no_number,))
return sum_total
sum_multiplier = greed(L, s)
return "no solution" if sum(sum_multiplier[1]) != s else sum_multiplier[0]
Second method:
def greedySum(L, s):
answer = []
try:
while (s >= L[0]):
total = s// L[0]
s -= (total * L[0])
answer.append(total)
L = L[1:]
return(str(answer)[1:-1])
except:
return("no solution")
Here is something that works:
def greedySum(L, s):
multiplier_sum = 0
for l in L:
(quot,rem) = divmod(s,l) # see how many 'l's you can fit in 's'
multiplier_sum += quot # add that number to the multiplier_sum
s = rem # update the remaining amount
# If at the end and s is 0, return the multiplier_sum
# Otherwise, signal that there is no solution
return multiplier_sum if s == 0 else "no solution"
I would offer more help on what is wrong with your code, but that is for the moment a moving target - you keep changing it!
>>> greedySum([4,2],8)
2
>>> greedySum([4,2],9)
'no solution'
>>> greedySum([4,2,1],9)
3

Categories