Big O of backtracking solution counts permutations with range - python

I have a problem and I've been struggling with my solution time and space complexity:
Given an array of integers A (which may contain duplicates) and three integers min, low, and high.
Find the total number of combinations of items in A such that:
low <= A[i] <= high for every chosen element.
Each combination has at least min numbers.
Numbers in one combination can be duplicates, since they are considered unique by their position in A, but combinations cannot be duplicates. E.g. for [1,1,2]: the combinations [1,1], [1,2], [1,1,2] are OK, but [1,1], [1,1], [1,2], [2,1] ... are not.
Example: A=[4, 6, 3, 13, 5, 10], min = 2, low = 3, high = 5
There are 4 ways to combine valid integers in A: [4,3],[4,5],[4,3,5],[3,5]
Here's my solution and it works:
class Solution:
    def __init__(self):
        pass

    def get_result(self, arr, min_size, low, high):
        return self._count_ways(arr, min_size, low, high, 0, 0)

    def _count_ways(self, arr, min_size, low, high, idx, comb_size):
        if idx == len(arr):
            return 0
        count = 0
        for i in range(idx, len(arr)):
            if arr[i] >= low and arr[i] <= high:
                comb_size += 1
                if comb_size >= min_size:
                    count += 1
                count += self._count_ways(arr, min_size, low, high, i + 1, comb_size)
                comb_size -= 1
        return count
I use backtracking so:
Time: O(n!) because for every single integer, I check with each and every single remaining one in worst case - when all integers can form combinations.
Space: O(n), since at most I need n calls on the call stack and I only use 2 variables to keep track of my combinations.
Is my analysis correct?
Also, a bit out of the scope but: Should I do some kind of memoization to improve it?

If I understand your requirements correctly, your algorithm is far too complicated. You can do it as follows:
Compute array B containing all elements in A between low and high.
Return sum of Choose(B.length, k) for k = min .. B.length, where Choose(n,k) is n(n-1)..(n-k+1)/k!.
Time and space complexities are O(n) if you use memoization to compute the numerators/denominators of the Choose function (e.g. if you have already computed 5*4*3, you only need one multiplication to compute 5*4*3*2 etc.).
In your example, you would get B = [4, 3, 5], so B.length = 3, and the result is
Choose(3, 2) + Choose(3, 3)
= (3 * 2)/(2 * 1) + (3 * 2 * 1)/(3 * 2 * 1)
= 3 + 1
= 4
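Here is a minimal sketch of that approach in Python (the function and variable names are mine, not from the answer), building the numerator and denominator of Choose incrementally as described:
def count_combinations(A, min_size, low, high):
    m = sum(1 for x in A if low <= x <= high)  # size of B
    total = 0
    num, den = 1, 1  # running numerator m(m-1)...(m-k+1) and denominator k!
    for k in range(1, m + 1):
        num *= m - k + 1
        den *= k
        if k >= min_size:
            total += num // den  # exact, since Choose(m, k) is an integer
    return total

print(count_combinations([4, 6, 3, 13, 5, 10], 2, 3, 5))  # 4, matching the worked example above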

Your analysis of the time complexity isn't quite right.
I understand where you're getting O(n!): the for i in range(idx, len(arr)): loop decreases in length with every recursive call, so it seems like you're doing n*(n-1)*(n-2)*....
However, the recursive calls from a loop of length m do not always contain a loop of size m-1. Suppose your outermost call has 3 elements. The loop iterates through 3 possible values, each spawning a new call. The first such call will have a loop that iterates over 2 values, the next call iterates over only 1 value, and the last immediately hits your base case and stops. So instead of 3*2*1 = ((1+1)+(1+1)+(1+1)), you get ((1+1)+1+0).
A call to _count_ways with an array of size n takes twice as long as a call with size n-1. To see this, consider the first decision in the call of size n, which is whether or not to choose the first element. First we choose it, which leads to a recursive call of size n-1. Second we do not choose it, which leaves n-1 elements to iterate over, so it is as if we had a second recursive call of size n-1.
Each increase in n increases the running time by a factor of 2, so the time complexity of your solution is O(2^n). This makes sense: you're checking every combination, and there are 2^n combinations in a set of size n.
However, as you're only trying to count the combinations and not do something with them, this is highly inefficient. See #Mo B.'s answer for a better solution.
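If you want to check that empirically, here is a rough instrumented sketch (my own simplification of your recursion, assuming the worst case where every element falls in [low, high], so every branch recurses):
def count_calls(n):
    # Mirrors the call structure of _count_ways when every arr[i] passes the range check.
    calls = 0
    def rec(idx):
        nonlocal calls
        calls += 1
        for i in range(idx, n):
            rec(i + 1)
    rec(0)
    return calls

print([count_calls(n) for n in range(1, 8)])  # [2, 4, 8, 16, 32, 64, 128]: doubles with each extra element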

Related

Time complexity for a greedy recursive algorithm

I have coded a greedy recursive algorithm to find the minimum number of coins that make a given change. Now I need to estimate its time complexity. As the algorithm has nested "ifs" depending on the same i (n * n), with the inner block halving the recursive call (log2(n)), I believe the correct answer could be O(n*log(n)), resulting from the following calculation:
n * log2(n) * O(1)
Please, give me your thoughts on whether my analysis is correct and feel free to also suggest improvements on my greedy recursive algorithm.
This is my recursive algorithm:
coins = [1, 5, 10, 21, 25]
coinsArraySize = len(coins)
change = 63
pickedCoins = []

def findMin(change, i, pickedCoins):
    if (i >= 0):
        if (change >= coins[i]):
            pickedCoins.append(coins[i])
            findMin(change - coins[i], i, pickedCoins)
        else:
            findMin(change, i - 1, pickedCoins)

findMin(change, coinsArraySize - 1, pickedCoins)
Each recursive call decreases change by at least 1, and there is no branching (that is, your recursion tree is actually a straight line, so no recursion is actually necessary). Your running time is O(n).
What is n? The runtime depends on both the amount and the specific coins. For example, suppose you have a million coins, 1 through 1,000,000, and try to make change for 1. The code will go a million recursive levels deep before it finally finds the largest coin it can use (1). Same thing in the end if you have only one coin (1) and try to make change for 1,000,000 - then you find the coin at once, but go a million levels deep picking that coin a million times.
Here's a non-recursive version that improves on both of those: use binary search to find the next usable coin, and once a suitable coin is found use it as often as possible.
def makechange(amount, coins):
    from bisect import bisect_right
    # assumes `coins` is sorted, and that coins[0] > 0
    right_bound = len(coins)
    result = []
    while amount > 0:
        # Find largest coin <= amount.
        i = bisect_right(coins, amount, 0, right_bound)
        if not i:
            raise ValueError("don't have a coin <=", amount)
        coin = coins[i - 1]
        # How many of those can we use?
        n, amount = divmod(amount, coin)
        assert n >= 1
        result.extend([coin] * n)
        right_bound = i - 1
    return result
It still takes O(amount) time if asked to make change for a million with the only coin being 1, but only because it has to build a result list with a million copies of 1. If there are a million coins and you ask for change for 1, though, it's O(log2(len(coins))) time. The first case could be slashed by changing the output format to a dict mapping each coin to the number of times it's used; then it would be cut to O(1) time.
As is, the time it takes is proportional to the length of the result list, plus some (usually trivial) time for a number of binary searches equal to the number of distinct coins used. So "a bad case" is one where every coin needs to be used; e.g.,
>>> coins = [2**i for i in range(10)]
>>> makechange(sum(coins), coins)
[512, 256, 128, 64, 32, 16, 8, 4, 2, 1]
That's essentially O(n + n log n) where n is len(coins).
Finding an optimal solution
As #Stef noted in a comment, the greedy algorithm doesn't always find a minimal number of coins. That's substantially harder. The usual approach is via dynamic programming, with worst case O(amt * len(coins)) time. But that's also best case: it works "bottom up", finding the smallest number of coins to reach 1, then 2, then 3, then 4, ..., and finally amt.
So I'm going to suggest a different approach, using breadth-first tree search, working down from the initial amount until reaching 0. Worst-case O() behavior is the same, but best-case time is far better. For the case from the comments:
mincoins(10000, [1, 2000, 3000])
case it looks at less than 20 nodes before finding the optimal 4-coin solution. Because it's a breadth-first search, it knows there's no possible shorter path to the root, so can stop right away then.
For a worst-case example, try
mincoins(1000001, range(2, 200, 2))
All the coins are even numbers, so it's impossible for any collection of them to sum to the odd target. The tree has to be expanded half a million levels deep before it realizes 0 is unreachable. But while the branching factor at high levels is O(len(coins)), the total number of nodes in the entire expanded tree is bounded above by amt + 1 (pigeonhole principle: the dict can't have more than amt + 1 keys, so any number of nodes beyond that are necessarily duplicate targets and so are discarded as soon as they're generated). So, in reality, the tree in this case grows wide very quickly, but then quickly becomes very narrow and very deep.
Also note that this approach makes it very easy to reconstruct the minimal collection of coins that sum to amt.
def mincoins(amt, coins):
    from collections import deque
    coins = sorted(set(coins))  # increasing, no duplicates
    # Map amount to coin that was subtracted to reach it.
    a2c = {amt: None}
    d = deque([amt])
    while d:
        x = d.popleft()
        for c in coins:
            y = x - c
            if y < 0:
                break     # all remaining coins too large
            if y in a2c:
                continue  # already found cheapest way to get y
            a2c[y] = c
            d.append(y)
            if not y:  # done!
                d.clear()
                break
    if 0 not in a2c:
        raise ValueError("not possible", amt, coins)
    picks = []
    a = 0
    while True:
        c = a2c[a]
        if c is None:
            break
        picks.append(c)
        a += c
    assert a == amt
    return sorted(picks)
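For example, with the data from the question above (change = 63, coins = [1, 5, 10, 21, 25]), the greedy picks six coins (25 + 25 + 10 + 1 + 1 + 1), while this search finds the optimal three:
print(mincoins(63, [1, 5, 10, 21, 25]))  # [21, 21, 21]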

Maximum non-contiguous sum of values in list less than or equal to k

I have a list of values [6,1,1,5,2] and a value k = 10. I want to find the maximum sum of values from the list that is less than or equal to k, return the value and the numbers used. In this case the output would be: 10, [6,1,1,2].
I was using this code from GeeksForGeeks as an example but it doesn't work correctly (in this case, the code's result is 9).
The values do not need to be contiguous - they can be in any order.
def maxsum(arr, n, sum):
    curr_sum = arr[0]
    max_sum = 0
    start = 0
    for i in range(1, n):
        if (curr_sum <= sum):
            max_sum = max(max_sum, curr_sum)
        while (curr_sum + arr[i] > sum and start < i):
            curr_sum -= arr[start]
            start += 1
        curr_sum += arr[i]
    if (curr_sum <= sum):
        max_sum = max(max_sum, curr_sum)
    return max_sum

if __name__ == '__main__':
    arr = [6, 1, 1, 5, 2]
    n = len(arr)
    sum = 10
    print(maxsum(arr, n, sum))
I also haven't figured out how to output the values that are used for the sum as a list.
This problem is at least as hard as the well-studied subset sum problem, which is NP-complete. In particular, any algorithm which solves your problem can be used to solve the subset sum problem, by finding the maximum sum <= k and then outputting True if the sum equals k, or False if the sum is less than k.
This means your problem is NP-hard, and there is no known algorithm which solves it in polynomial time. Your algorithm's running time is linear in the length of the input array, so it cannot correctly solve the problem, and no similar algorithm can correctly solve the problem.
One approach that can work is a backtracking search - for each element, try including it in the sum, then backtrack and try not including it in the sum. This will take exponential time in the length of the input array.
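A minimal sketch of that backtracking idea (my own names, assuming non-negative values so a branch can be cut once the running total exceeds k):
def best_sum(values, k):
    best = 0
    def rec(i, total):
        nonlocal best
        if total > k:
            return  # prune: with non-negative values this branch can't recover
        best = max(best, total)
        if i == len(values):
            return
        rec(i + 1, total + values[i])  # include values[i]
        rec(i + 1, total)              # exclude values[i]
    rec(0, 0)
    return best

print(best_sum([6, 1, 1, 5, 2], 10))  # 10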
If your array elements are always integers, another option is dynamic programming; there is a standard dynamic programming algorithm which solves the integer subset sum problem in pseudopolynomial time, which could easily be adapted to solve your form of the problem.
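And here is a sketch of that pseudopolynomial DP adaptation (again my own names, assuming the values and k are non-negative integers):
def max_sum_at_most_k(values, k):
    # reachable[s] is True when some subset of the values seen so far sums to exactly s
    reachable = [False] * (k + 1)
    reachable[0] = True
    for v in values:
        for s in range(k, v - 1, -1):  # iterate downward so each value is used at most once
            if reachable[s - v]:
                reachable[s] = True
    return max(s for s in range(k + 1) if reachable[s])

print(max_sum_at_most_k([6, 1, 1, 5, 2], 10))  # 10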
Here's a solution using itertools.combinations. It's fast enough for small lists, but slows down significantly if you have a large sum and large list of values.
from itertools import combinations

def find_combo(values, k):
    for num_sum in range(k, 0, -1):
        for quant in range(1, len(values) + 1):
            for combo in combinations(values, quant):
                if sum(combo) == num_sum:
                    return combo

values = [6, 1, 1, 5, 2]
k = 10
answer = find_combo(values, k)
print(answer, sum(answer))
This solution works for any values in a list and any k, as long as the number of values needed in the solution sum doesn't become large.
The solution presented by user10987432 has a flaw that this function avoids, which is that it always accepts values that keep the sum below k. With that solution, the values are ordered from largest to smallest and then iterated through and added to the solution if it doesn't bring the sum higher than k. However a simple example shows this to be inaccurate:
values = [7, 5, 4, 1] k = 10
In that solution, the sum would begin at 0, then go up to 7 with the first item, and finish at 8 after reaching the last index. The correct solution, however, is 5 + 4 + 1 = 10.
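Here is a rough sketch of that largest-first greedy (my own reconstruction of the approach being criticized, not the original code), showing the failure:
def greedy_at_most_k(values, k):
    total, picked = 0, []
    for v in sorted(values, reverse=True):
        if total + v <= k:   # accept any value that keeps the sum at or below k
            total += v
            picked.append(v)
    return total, picked

print(greedy_at_most_k([7, 5, 4, 1], 10))  # (8, [7, 1]) -- misses the optimal 5 + 4 + 1 = 10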

Solution Performance (DP, hashtable) for Partition Equal Subset Sum

I know there are some related questions that have already been asked on Stack Overflow. However, this question is more about the performance difference among the 3 approaches below.
The question is: Given a non-empty array containing only positive integers, find if the array can be partitioned into two subsets such that the sum of elements in both subsets is equal. https://leetcode.com/problems/partition-equal-subset-sum/
e.g. [1, 5, 11, 5] = True, [1, 5, 9] = False
By solving this problem, I have tried 3 approaches:
Approach 1: Dynamic programming, top-down recursion + memoization (Result: Time Limit Exceeded):
def canPartition(nums):
    total, n = sum(nums), len(nums)
    if total & 1 == 1: return False
    half = total >> 1
    mem = [[0 for _ in range(half)] for _ in range(n)]
    def dp(n, half, mem):
        if half == 0: return True
        if n == -1: return False
        if mem[n - 1][half - 1]: return mem[n - 1][half - 1]
        mem[n - 1][half - 1] = dp(n - 1, half, mem) or dp(n - 1, half - nums[n - 1], mem)
        return mem[n - 1][half - 1]
    return dp(n - 1, half, mem)
Approach 2: Dynamic Programming. Bottom up. (Result: 2208 ms Accepted):
def canPartition(self, nums):
    total, n = sum(nums), len(nums)
    if total & 1 == 1: return False
    half = total >> 1
    matrix = [[0 for _ in range(half + 1)] for _ in range(n)]
    for i in range(n):
        for j in range(1, half + 1):
            if i == 0:
                if j >= nums[i]: matrix[i][j] = nums[i]
                else: matrix[i][j] = 0
            else:
                if j >= nums[i]:
                    matrix[i][j] = max(matrix[i - 1][j], nums[i] + matrix[i - 1][j - nums[i]])
                else: matrix[i][j] = matrix[i - 1][j]
            if matrix[i][j] == half: return True
    return False
Approach 3: HashTable (Dict). Result (172 ms Accepted):
def canPartition(self, nums):
    total = sum(nums)
    if total & 1 == 0:
        half = total >> 1
        cur = {0}
        for number in nums:
            cur |= {number + x for x in cur}  # update the dictionary (hashtable) if key doesn't exist
            if half in cur: return True
    return False
I really don't understand two things about the above 3 approaches regarding time complexity:
I would expect approach 1 and approach 2 to perform similarly. Both use a table (matrix) to record calculated states, so why is the bottom-up approach quicker?
I don't know why approach 3 is so much quicker than the others. Note: at a glance, approach 3 looks like a 2^n approach, but it uses a set (hash table) to discard duplicate values, so the time complexity should be O(n * half).
My guess about the difference between approach 1 and the others is that, due to the recursion, approach 1 needs to generate significantly more stack frames, which cost more in system resources than just allocating a matrix and iterating over a conditional. But if I were you, I would try to use some kind of process and memory analyzer to better determine and confirm what's happening. Approach 1 allocates a matrix dependent on the range, but the algorithm actually limits the number of iterations to potentially far fewer, since each recursive call jumps to a sum reduced by the array element rather than combing through all possibilities.
Approach 3 depends solely on the number of input elements and the number of sums that can be generated. In each iteration, it adds the current number in the input to all previously achievable numbers, adding only new ones to that list. Given the list [50000, 50000, 50000], for example, approach 3 would iterate over at most three sums: 50000, 100000, and 150000. But since it depends on the range, approach 2 would iterate at least 75000 * 3 times!
Given the list [50000, 50000, 50000], approaches 1, 2 and 3 generate the following numbers of iterations: 15, 225000, and 6.
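You can see that directly by running the core of approach 3 on that list (a quick check, not part of the original answer):
nums = [50000, 50000, 50000]
cur = {0}
for number in nums:
    cur |= {number + x for x in cur}
print(sorted(cur))  # [0, 50000, 100000, 150000] -- only a handful of reachable sums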
You are right: approaches 1) and 3) have the same time complexity. Approach 2 is the DP version of the 0/1 knapsack; approach 1 is the branch-and-bound version. You can improve approach 1 by pruning the tree through any of the knapsack heuristics, but the optimization has to be strict, e.g. if the existing sum plus the sum of the remaining elements at level K is < half, then skip it. This way approach 1) can have better computational complexity than 3).
Why do approaches 1) and 3) have different running times?
[To some extent]
That has more to do with the implementation of dictionaries and sets in Python. They are implemented natively by the Python interpreter, so any operation on them is simply going to be faster than anything that needs to be interpreted first, and more often. Also, function calls have higher overhead in Python because they are objects, so calling one is not a simple bump-the-stack-and-jmp/call operation.
[To a large extent]
Another aspect to mull over is the time complexity of the third approach. For approach 3, the only way the time complexity can be exponential is if each iteration inserts as many new elements as are already in the set at that iteration.
cur |= { number + x for x in cur}
In that worst case, the above line doubles |cur|.
I think that it is possible for a series like
s = {k, k^2, k^3, ..., k^n, (something > k^(n+1))}
(where k is a prime > 2)
to give a worst-case time on the order of 2^n for approach 3. I'm not sure yet what the average expected time complexity is.
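A quick check of that kind of worst case (my own toy example with powers of 3; every subset sum is distinct, so the set doubles on every step):
cur = {0}
for number in [3, 9, 27, 81]:
    cur |= {number + x for x in cur}
print(len(cur))  # 16 == 2**4: the set doubled on every iteration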

Find duplicate in array - Time complexity < O(n^2) and constant extra space O(1). (Amazon Interview)

Given below is the problem statement and the solution. I am not able to grasp the logic behind the solution.
Problem Statement:
Given an array nums containing n + 1 integers where each integer is between 1 and n (inclusive), prove that at least one duplicate number must exist. Assume that there is only one duplicate number, find the duplicate one.
Note:
You must not modify the array (assume the array is read only).
You must use only constant, O(1) extra space.
Your runtime complexity should be less than O(n^2).
There is only one duplicate number in the array, but it could be repeated more than once.
Sample Input: [3 4 1 4 1]
Output: 1
The Solution for the problem posted on leetcode is:
class Solution(object):
    def findDuplicate(self, nums):
        """
        :type nums: List[int]
        :rtype: int
        """
        low = 1
        high = len(nums) - 1
        while low < high:
            mid = low + (high - low) // 2
            count = 0
            for i in nums:
                if i <= mid:
                    count += 1
            if count <= mid:
                low = mid + 1
            else:
                high = mid
        return low
Explanation for the above code (as per the author):
This solution is based on binary search.
At first the search space is the numbers between 1 and n. Each time I select a number mid (the one in the middle) and count all the numbers less than or equal to mid. Then if the count is more than mid, the search space will be [1 mid], otherwise [mid+1 n]. I do this until the search space is only one number.
Let's say n=10 and I select mid=5. Then I count all the numbers in the array which are less than or equal to mid. If there are more than 5 numbers that are less than or equal to 5, then by the Pigeonhole Principle (https://en.wikipedia.org/wiki/Pigeonhole_principle) one of them has occurred more than once. So I shrink the search space from [1 10] to [1 5]. Otherwise the duplicate number is in the second half, so for the next step the search space would be [6 10].
Doubt: In the above solution, when count <= mid, why do we change low to mid + 1, and otherwise change high to mid? What's the logic behind it?
I am unable to understand the logic behind this algorithm.
Related Link:
https://discuss.leetcode.com/topic/25580/two-solutions-with-explanation-o-nlog-n-and-o-n-time-o-1-space-without-changing-the-input-array
Well it's a binary search. You are cutting the search space in half and repeating.
Think about it this way: you have a list of 101 items, and you know it contains values 1-100. Take the halfway point, 50. Count how many items are less than or equal to 50. If there are more than 50 items that are less than or equal to 50, then the duplicate is in the range 1-50; otherwise the duplicate is in the range 51-100.
Binary search is simply cutting the range in half and repeating: looking at 1-50, taking midpoint 25, and so on.
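A quick illustration of that counting step on a made-up example (101 values drawn from 1..100, with 37 as the planted duplicate):
nums = list(range(1, 101)) + [37]  # 101 values, each in 1..100; 37 appears twice
mid = 50
count = sum(1 for x in nums if x <= mid)
print(count)  # 51, which is > 50, so the duplicate must lie in 1..50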
The crucial part of this algorithm which I believe is causing confusion is the for loop. I'll attempt to explain it. Firstly note that there is no usage of indices anywhere in this algorithm - just inspect the code and you'll see that index references do not exist. Secondly, note that the algorithm loops through the entire collection for each iteration of the while loop.
Let me make the following change, then consider the value of inspection_count after every while loop.
inspection_count = 0
for i in nums:
    inspection_count += 1
    if i <= mid:
        count += 1
Of course inspection_count will be equal to len(nums). The for loop iterates the entire collection, and for every element checks to see whether it is within the candidate range (of values, not indices).
The duplication test itself is simple and elegant - as others pointed out, this is the pigeonhole principle. Given a collection of n values where every value is in the range {p..q}, if q-p+1 < n (there are fewer distinct possible values than values in the collection) then there must be duplicates. Think of some easy cases -
p = 0, q = 5, n = 10
"I have ten values, and every value is between zero and five.
At least one of these values must be duplicated."
We can generalize this, but a more valid and relevant example is
p = 50, q = 99, n = 51
"I have a collection of fifty-one values, and every value is between fifty and ninety-nine.
There are only fifty *distinct* values available in that range.
Therefore there is a duplicate."
The logic behind setting low = mid+1 or high = mid is essentially what makes it a solution based on binary search. The search space is divided in half and the while loop is searching only in the lower half (high = mid) or the higher half (low = mid+1) on its next iteration.
So I shrink the search space from [1 10] to [1 5]. Otherwise the duplicate number is in the second half so for the next step the search space would be [6 10].
This is the part of the explanation regarding your question.
Let's say you have 10 numbers:
a = [1, 2, 2, 3, 4, 5, 6, 7, 8, 9]
Then mid = 5,
and the number of elements that are less than or equal to 5 is 6 (1, 2, 2, 3, 4, 5).
Now count = 6, which is greater than mid. This implies that there is at least one duplicate in the first half, so the code shrinks the search space to the first half, that is, from [1-9] to [1-5], and so on.
Otherwise a duplicate occurs in the second half, so the search space will be [6-9].
Do tell me if you have doubts.
public static void findDuplicateInArrayTest() {
    int[] arr = {1, 7, 7, 3, 6, 7, 2, 4};
    int dup = findDuplicateInArray(arr, 0, arr.length - 1);
    System.out.println("duplicate: " + dup);
}

public static int findDuplicateInArray(int[] arr, int l, int r) {
    while (l != r) {
        int m = (l + r) / 2;
        int count = 0;
        for (int i = 0; i < arr.length; i++)
            if (arr[i] <= m)
                count++;
        if (count > m)
            r = m;
        else
            l = m + 1;
    }
    return l;
}

Number of multiples less than the max number

For the following problem on SingPath:
Given an input of a list of numbers and a high number,
return the number of multiples of each of
those numbers that are less than the maximum number.
For this case the list will contain a maximum of 3 numbers
that are all relatively prime to each
other.
Here is my code:
def countMultiples(l, max_num):
    counting_list = []
    for i in l:
        for j in range(1, max_num):
            if (i * j < max_num) and (i * j) not in counting_list:
                counting_list.append(i * j)
    return len(counting_list)
Although my algorithm works okay, it gets stuck when the maximum number is way too big
>>> countMultiples([3],30)
9 #WORKS GOOD
>>> countMultiples([3,5],100)
46 #WORKS GOOD
>>> countMultiples([13,25],100250)
Line 5: TimeLimitError: Program exceeded run time limit.
How to optimize this code?
3 and 5 have some multiples in common, like 15.
You should count those shared multiples only once, and you will get the right answer.
Also, check out the inclusion-exclusion principle: https://en.wikipedia.org/wiki/Inclusion-exclusion_principle#Counting_integers
EDIT:
The problem can be solved in constant time. As linked above, the solution lies in the inclusion-exclusion principle.
Let's say you want the number of multiples of 3 less than 100; you can get that with floor(100/3). The same applies for 5: floor(100/5).
Now to get the number of multiples of 3 or 5 that are less than 100, you add the two counts and subtract the ones that are multiples of both, in this case the multiples of 15.
So the answer for multiples of 3 and 5 that are less than 100 is floor(100/3) + floor(100/5) - floor(100/15).
If you have more than 2 numbers it gets a bit more complicated, but the same approach applies; for more, check https://en.wikipedia.org/wiki/Inclusion-exclusion_principle#Counting_integers
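Here is a sketch of that inclusion-exclusion count in Python (the function name is mine; math.lcm needs Python 3.9+, and since the problem guarantees pairwise-coprime numbers the lcm is just the product):
from itertools import combinations
from math import lcm

def countMultiplesIE(l, max_num):
    # Count integers in [1, max_num) divisible by at least one number in l.
    total = 0
    for r in range(1, len(l) + 1):
        for subset in combinations(l, r):
            m = lcm(*subset)
            # add counts for odd-sized subsets, subtract for even-sized ones
            total += (-1) ** (r + 1) * ((max_num - 1) // m)
    return total

print(countMultiplesIE([3, 5], 100))       # 46, matching the worked example above
print(countMultiplesIE([13, 25], 100250))  # 11412, computed instantly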
EDIT2:
Also, the loop variant can be sped up.
Your current algorithm appends every multiple to a list and does membership tests against that list, which is very slow.
You should switch the inner and outer for loops. By doing that, for each number you check whether any of the divisors divides it.
So just add a boolean variable which tells you whether any of your divisors divides the number, and count the times the variable is true.
It would look like this:
def countMultiples(l, max_num):
    nums = 0
    for j in range(1, max_num):
        isMultiple = False
        for i in l:
            if (j % i == 0):
                isMultiple = True
        if (isMultiple == True):
            nums += 1
    return nums

print(countMultiples([13, 25], 100250))
If the length of the list is all you need, you'd be better off keeping a running tally as you go instead of taking the length of the list at the end.
def countMultiples(l, max_num):
    count = 0
    counting_list = []
    for i in l:
        for j in range(1, max_num):
            if (i * j < max_num) and (i * j) not in counting_list:
                counting_list.append(i * j)  # still track seen multiples so shared ones aren't counted twice
                count += 1
    return count
