# Returns true if there exists a subsequence of `A[0…n]` with the given sum
def subsetSum(A, n, k, lookup):

    # return true if the sum becomes 0 (subset found)
    if k == 0:
        return True

    # base case: no items left, or sum becomes negative
    if n < 0 or k < 0:
        return False

    # construct a unique key from dynamic elements of the input
    key = (n, k)

    # if the subproblem is seen for the first time, solve it and
    # store its result in a dictionary
    if key not in lookup:

        # Case 1. Include the current item `A[n]` in the subset and recur
        # for the remaining items `n-1` with the decreased total `k-A[n]`
        include = subsetSum(A, n - 1, k - A[n], lookup)

        # Case 2. Exclude the current item `A[n]` from the subset and recur for
        # the remaining items `n-1`
        exclude = subsetSum(A, n - 1, k, lookup)

        # assign true if we get subset by including or excluding the current item
        lookup[key] = include or exclude

    # return solution to the current subproblem
    return lookup[key]


if __name__ == '__main__':

    # Input: a set of items and a sum
    A = [7, 3, 2, 5, 8]
    k = 14

    # create a dictionary to store solutions to subproblems
    lookup = {}

    if subsetSum(A, len(A) - 1, k, lookup):
        print('Subsequence with the given sum exists')
    else:
        print('Subsequence with the given sum does not exist')
It is said that the complexity of this algorithm is O(n * sum), but I can't understand how or why; can someone help me? Either a wordy explanation or a recurrence relation would be fine.
The simplest explanation I can give is to realize that when lookup[(n, k)] has a value, it is True or False and indicates whether some subset of A[:n+1] sums to k.
Imagine a naive algorithm that just fills in all the elements of lookup row by row.
lookup[(0, i)] (for 0 ≤ i ≤ total) is true for exactly two values, i = A[0] and i = 0; for every other i it is false.
lookup[(1, i)] (for 0 ≤ i ≤ total) is true if lookup[(0, i)] is true, or if i ≥ A[1] and lookup[(0, i - A[1])] is true. I can reach the sum i either by using A[1] or not, and I've already calculated both of those cases.
...
lookup[(r, i)] (for 0 ≤ i ≤ total) is true if lookup[(r - 1, i)] is true, or if i ≥ A[r] and lookup[(r - 1, i - A[r])] is true.
Filling in the table this way, it is clear that we can completely fill the lookup table for rows 0 ≤ row < len(A) in time len(A) * total, since filling in each element takes constant time. And our final answer is just checking whether lookup[(len(A) - 1, k)] is True.
Your program is doing the exact same thing, but calculating the value of entries of lookup as they are needed.
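As an illustration (a rough sketch of that row-by-row filling, my own code rather than the original poster's), the bottom-up version might look like this:

# Minimal bottom-up sketch of the table-filling described above.
# table[r][i] is True when some subset of A[:r+1] sums to i.
def subset_sum_bottom_up(A, total):
    n = len(A)
    table = [[False] * (total + 1) for _ in range(n)]

    # Row 0: only the sums 0 and A[0] are reachable.
    table[0][0] = True
    if A[0] <= total:
        table[0][A[0]] = True

    # Row r: reachable either without A[r], or by adding A[r] to a sum
    # already reachable in row r-1. Each cell costs O(1), so the whole
    # table costs O(len(A) * total).
    for r in range(1, n):
        for i in range(total + 1):
            table[r][i] = table[r - 1][i] or (i >= A[r] and table[r - 1][i - A[r]])

    return table[n - 1][total]

print(subset_sum_bottom_up([7, 3, 2, 5, 8], 14))  # True (7 + 2 + 5 = 14)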
Sorry for submitting two answers. I think I came up with a slightly simpler explanation.
Take your code and imagine putting the three lines inside if key not in lookup: into a separate function, calculateLookup(A, n, k, lookup). I'm going to define "the cost of calling calculateLookup for n and k" to be the total time spent in the call to calculateLookup(A, n, k, lookup) for those specific values of n and k, excluding any recursive calls to calculateLookup.
The key insight is that, as defined above, the cost of calling calculateLookup() for any n and k is O(1). Since we are excluding recursive calls from the cost, and there are no for loops, the cost of calculateLookup is just the cost of executing a few tests.
The entire algorithm does a fixed amount of work, calls calculateLookup, and then does a small amount of work. Hence the time spent in our code comes down to asking: how many times do we call calculateLookup?
Now we're back to the previous answer. Because of the lookup table, every call to calculateLookup is made with a different value of (n, k). We also know that we check the bounds of n and k before each call to calculateLookup, so 1 ≤ k ≤ sum and 0 ≤ n < len(A). So calculateLookup is called at most len(A) * sum times.
In general, for these algorithms that use memoization/caching, the easiest thing to do is to separately calculate and then sum:
How long things take assuming all values you need are cached.
How long it takes to fill the cache.
The algorithm you presented is just filling up the lookup cache. It's doing it in an unusual order, and it's not filling every entry in the table, but that's all it's doing.
The code would be slightly faster with
lookup[key] = subsetSum(A, n - 1, k - A[n], lookup) or subsetSum(A, n - 1, k, lookup)
It doesn't change the big-O of the code in the worst case, but thanks to or short-circuiting it can avoid some unnecessary calculations.
I have a problem and I've been struggling with my solution's time and space complexity:
Given an array of integers A (which may contain duplicates) and integers min, low, high.
Find the total number of combinations of items in A that:
low <= A[i] <= high
Each combination has at least min numbers.
Numbers in one combination can be duplicates, since they're considered unique in A, but combinations cannot be duplicates. E.g.: [1,1,2] -> the combinations [1,1], [1,2], [1,1,2] are OK, but [1,1], [1,1], [1,2], [2,1] ... are not.
Example: A=[4, 6, 3, 13, 5, 10], min = 2, low = 3, high = 5
There are 4 ways to combine valid integers in A: [4,3],[4,5],[4,3,5],[3,5]
Here's my solution and it works:
class Solution:
    def __init__(self):
        pass

    def get_result(self, arr, min_size, low, high):
        return self._count_ways(arr, min_size, low, high, 0, 0)

    def _count_ways(self, arr, min_size, low, high, idx, comb_size):
        if idx == len(arr):
            return 0
        count = 0
        for i in range(idx, len(arr)):
            if arr[i] >= low and arr[i] <= high:
                comb_size += 1
                if comb_size >= min_size:
                    count += 1
                count += self._count_ways(arr, min_size, low, high, i + 1, comb_size)
                comb_size -= 1
        return count
I use backtracking so:
Time: O(n!) because for every single integer, I check against each and every remaining one in the worst case - when all the integers can form combinations.
Space: O(n), since at most I need n calls on the call stack, and I only use 2 variables to keep track of my combinations.
Is my analysis correct?
Also, a bit out of the scope but: Should I do some kind of memoization to improve it?
If I understand your requirements correctly, your algorithm is far too complicated. You can do it as follows:
Compute array B containing all elements in A between low and high.
Return sum of Choose(B.length, k) for k = min .. B.length, where Choose(n,k) is n(n-1)..(n-k+1)/k!.
Time and space complexities are O(n) if you use memoization to compute the numerators/denominators of the Choose function (e.g. if you have already computed 5*4*3, you only need one multiplication to compute 5*4*3*2 etc.).
In your example, you would get B = [4, 3, 5], so B.length = 3, and the result is
Choose(3, 2) + Choose(3, 3)
= (3 * 2)/(2 * 1) + (3 * 2 * 1)/(3 * 2 * 1)
= 3 + 1
= 4
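A minimal sketch of this counting approach (my own code, using Python's built-in math.comb for the Choose function rather than hand-rolled memoized products):

from math import comb  # comb(n, k) is Choose(n, k), available since Python 3.8

def count_combinations(A, min_size, low, high):
    # Keep only the values allowed in a combination.
    B = [x for x in A if low <= x <= high]
    # Sum Choose(len(B), k) over all allowed combination sizes.
    return sum(comb(len(B), k) for k in range(min_size, len(B) + 1))

print(count_combinations([4, 6, 3, 13, 5, 10], 2, 3, 5))  # 4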
Your analysis of the time complexity isn't quite right.
I understand where you're getting O(n!): the for i in range(idx, len(arr)): loop decreases in length with every recursive call, so it seems like you're doing n*(n-1)*(n-2)*....
However, the recursive calls from a loop of length m do not always contain a loop of size m-1. Suppose your outermost call has 3 elements. The loop iterates through 3 possible values, each spawning a new call. The first such call will have a loop that iterates over 2 values, but the next call iterates over only 1 value, and the last immediately hits your base case and stops. So instead of 3*2*1=((1+1)+(1+1)+(1+1)), you get ((1+0)+1+0).
A call to _count_ways with an array of size n takes twice as long as a call with size n-1. To see this, consider the first branch in the call of size n which is to choose the first element or not. First we choose that first element, which leads to a recursive call with size n-1. Second we do not choose that first element, which gives us n-1 elements left to iterate over, so it's as if we had a second recursive call with size n-1.
Each increase in n increases the time complexity by a factor of 2, so the time complexity of your solution is O(2^n). This makes sense: you're checking every combination, and there are 2^n combinations of a set of size n.
However, since you're only trying to count the combinations and not do something with each of them, this is highly inefficient. See Mo B.'s answer for a better solution.
I have a list of values [6,1,1,5,2] and a value k = 10. I want to find the maximum sum of values from the list that is less than or equal to k, return the value and the numbers used. In this case the output would be: 10, [6,1,1,2].
I was using this code from GeeksForGeeks as an example but it doesn't work correctly (in this case, the code's result is 9).
The values do not need to be contiguous - they can be in any order.
def maxsum(arr, n, sum):
    curr_sum = arr[0]
    max_sum = 0
    start = 0

    for i in range(1, n):
        if curr_sum <= sum:
            max_sum = max(max_sum, curr_sum)

        while curr_sum + arr[i] > sum and start < i:
            curr_sum -= arr[start]
            start += 1

        curr_sum += arr[i]

    if curr_sum <= sum:
        max_sum = max(max_sum, curr_sum)

    return max_sum


if __name__ == '__main__':
    arr = [6, 1, 1, 5, 2]
    n = len(arr)
    sum = 10
    print(maxsum(arr, n, sum))
I also haven't figured out how to output the values that are used for the sum as a list.
This problem is at least as hard as the well-studied subset sum problem, which is NP-complete. In particular, any algorithm which solves your problem can be used to solve the subset sum problem, by finding the maximum sum <= k and then outputting True if the sum equals k, or False if the sum is less than k.
This means your problem is NP-hard, and there is no known algorithm which solves it in polynomial time. Your algorithm's running time is linear in the length of the input array, so it cannot correctly solve the problem, and no similar algorithm can correctly solve the problem.
One approach that can work is a backtracking search - for each element, try including it in the sum, then backtrack and try not including it in the sum. This will take exponential time in the length of the input array.
If your array elements are always integers, another option is dynamic programming; there is a standard dynamic programming algorithm which solves the integer subset sum problem in pseudopolynomial time, which could easily be adapted to solve your form of the problem.
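For illustration only (a sketch of my own, not code from any of the answers): a pseudopolynomial dynamic program over reachable sums, adapted to return the best sum <= k together with one set of values achieving it.

def max_subset_sum_at_most(values, k):
    # Best achievable subset sum <= k, plus one subset achieving it.
    # Assumes non-negative integer values; roughly O(len(values) * k) states.
    reachable = {0: []}  # reachable[s] = one list of values summing to s
    for v in values:
        updates = {}
        for s, used in reachable.items():
            t = s + v
            if t <= k and t not in reachable:
                updates[t] = used + [v]
        reachable.update(updates)
    best = max(reachable)
    return best, reachable[best]

print(max_subset_sum_at_most([6, 1, 1, 5, 2], 10))  # (10, [6, 1, 1, 2])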
Here's a solution using itertools.combinations. It's fast enough for small lists, but slows down significantly if you have a large sum and large list of values.
from itertools import combinations

def find_combo(values, k):
    for num_sum in range(k, 0, -1):
        for quant in range(1, len(values) + 1):
            for combo in combinations(values, quant):
                if sum(combo) == num_sum:
                    return combo

values = [6, 1, 1, 5, 2]
k = 10
answer = find_combo(values, k)
print(answer, sum(answer))
This solution works for any values in a list and any k, as long as the number of values needed in the solution sum doesn't become large.
The greedy solution presented by user10987432 has a flaw that this function avoids: it always accepts a value as long as it keeps the running sum at or below k. In that solution, the values are ordered from largest to smallest and then iterated through, each being added to the result if it doesn't push the sum above k. A simple example shows this to be wrong:
values = [7, 5, 4, 1], k = 10
In that solution, the sum begins at 0, goes up to 7 with the first item, and finishes at 8 after reaching the last index. The correct solution, however, is 5 + 4 + 1 = 10.
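The referenced greedy solution isn't reproduced in this thread, but a sketch of that idea (my own reconstruction of the description above) makes the failure easy to see:

def greedy_max_sum(values, k):
    # Take values from largest to smallest whenever they still fit under k.
    total = 0
    for v in sorted(values, reverse=True):
        if total + v <= k:
            total += v
    return total

print(greedy_max_sum([7, 5, 4, 1], 10))  # 8, even though 5 + 4 + 1 = 10 is achievable
print(find_combo([7, 5, 4, 1], 10))      # (5, 4, 1), from the exhaustive search above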
You want a list of the ordered products n x m such that both n and m are natural numbers and 1 < (n x m) < upper_limit, say upper_limit = 100. Also, neither n nor m can be bigger than the square root of the upper limit (therefore n <= 10 and m <= 10).
The most straightforward thing to do would be to generate all the products with a list comprehension and then sort the result.
sorted(n*m for n in range(1,10) for m in range(1,n))
However, when upper_limit becomes very big this is not very efficient, especially if the objective is to find only one number matching certain criteria (e.g. find the max product such that ...). In that case I would want to generate the products in descending order, test them, and stop the whole process as soon as I find the first one that satisfies the criteria.
So, how can these products be generated in order?
The first thing I did was to start from the upper_limit and go backwards one by one, performing a double test:
- checking if the number can be a product of n and m
- checking for the criteria
Again, this is not very efficient ...
Is there an algorithm that solves this problem?
I found a slightly more efficient solution to this problem.
For a and b being natural numbers:
S = a + b
D = abs(a - b)
If S is constant, the smaller D is, the bigger a*b is.
For each S (taken in decreasing order) it is therefore possible to iterate through all the possible tuples (a, b) with increasing D.
First I check the external condition; if the product a*b satisfies it, I then iterate through the other (a, b) tuples with smaller S and smaller D, to check whether any of them also satisfies the condition but has a bigger product a*b. I repeat the iteration until I reach a tuple with D == 0 or 1 (because in that case there cannot be tuples with smaller S that have a higher product).
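A different way to generate the products in descending order (not the S/D scheme above, but a max-heap over the rows of the multiplication table; the function name and the sample stopping criterion below are my own) would be something like:

import heapq

def products_descending(limit):
    # Yield (product, n, m) with 1 <= m <= n <= limit, in non-increasing
    # order of product, keeping one heap entry per value of n.
    heap = [(-n * n, n, n) for n in range(1, limit + 1)]  # largest product per row
    heapq.heapify(heap)
    while heap:
        neg, n, m = heapq.heappop(heap)
        yield -neg, n, m
        if m > 1:
            # Next-largest product in this row of the multiplication table.
            heapq.heappush(heap, (-n * (m - 1), n, m - 1))

# Example: scan from the top and stop at the first product meeting a criterion.
for p, n, m in products_descending(10):
    if p < 100:  # stand-in for the real criterion
        print(p, n, m)  # 90 10 9
        break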
The following code checks all the possible combinations without repetition and stops when the condition is met (the required_condition(i, j) call below is a placeholder for whatever criterion is being tested). In this code, if the break in the inner loop is executed, the break statement in the outer loop is executed as well; otherwise the continue statement runs and the outer loop moves on.
from math import sqrt

n = m = round(sqrt(int(input("Enter upper limit"))))
for i in range(n, 0, -1):
    for j in range(i - 1, 0, -1):
        if required_condition(i, j):  # placeholder: plug in the required condition here
            n = i
            m = j
            break
    else:
        continue
    break
I know some related questions have already been asked on Stack Overflow. However, this question is more about the performance difference among the three approaches below.
The question is: Given a non-empty array containing only positive integers, find if the array can be partitioned into two subsets such that the sum of elements in both subsets is equal. https://leetcode.com/problems/partition-equal-subset-sum/
e.g. [1, 5, 11, 5] -> True, [1, 5, 9] -> False
By solving this problem, I have tried 3 approaches:
Approach 1: Dynamic Programming. Top-down recursion + memoization (Result: Time Limit Exceeded):
def canPartition(nums):
    total, n = sum(nums), len(nums)
    if total & 1 == 1: return False
    half = total >> 1
    mem = [[0 for _ in range(half)] for _ in range(n)]

    def dp(n, half, mem):
        if half == 0: return True
        if n == -1: return False
        if mem[n - 1][half - 1]: return mem[n - 1][half - 1]
        mem[n - 1][half - 1] = dp(n - 1, half, mem) or dp(n - 1, half - nums[n - 1], mem)
        return mem[n - 1][half - 1]

    return dp(n - 1, half, mem)
Approach 2: Dynamic Programming. Bottom up. (Result: 2208 ms Accepted):
def canPartition(self, nums):
    total, n = sum(nums), len(nums)
    if total & 1 == 1: return False
    half = total >> 1
    matrix = [[0 for _ in range(half + 1)] for _ in range(n)]
    for i in range(n):
        for j in range(1, half + 1):
            if i == 0:
                if j >= nums[i]: matrix[i][j] = nums[i]
                else: matrix[i][j] = 0
            else:
                if j >= nums[i]:
                    matrix[i][j] = max(matrix[i - 1][j], nums[i] + matrix[i - 1][j - nums[i]])
                else: matrix[i][j] = matrix[i - 1][j]
            if matrix[i][j] == half: return True
    return False
Approach 3: HashTable (Dict). Result (172 ms Accepted):
def canPartition(self, nums):
    total = sum(nums)
    if total & 1 == 0:
        half = total >> 1
        cur = {0}
        for number in nums:
            cur |= {number + x for x in cur}  # add the new reachable sums; the set discards duplicates
            if half in cur: return True
    return False
I really don't understand two things about the time complexity of the above 3 approaches:
I would expect approach 1 and approach 2 to give similar results. Both use a table (matrix) to record the calculated states, so why is the bottom-up approach quicker?
I don't know why approach 3 is so much quicker than the others. Note: at a glance, approach 3 looks like a 2-to-the-nth-power approach, but it uses a set to discard duplicate sums, so the time complexity should be O(n * half).
My guess about the difference between approach 1 and the others is that, due to the recursion, approach 1 needs to generate significantly more stack frames, which costs more in system resources than simply allocating the matrix and iterating over a conditional. But if I were you, I would try to use some kind of process and memory analyzer to better determine and confirm what's happening. Approach 1 allocates a matrix dependent on the range, but the algorithm actually limits the number of iterations to potentially far fewer, since each recursive call jumps to the sum minus an array element rather than combing through all the possibilities.
Approach 3 depends solely on the number of input elements and the number of sums that can be generated. In each iteration, it adds the current number in the input to all previously achievable numbers, adding only new ones to that list. Given the list [50000, 50000, 50000], for example, approach 3 would iterate over at most three sums: 50000, 100000, and 150000. But since it depends on the range, approach 2 would iterate at least 75000 * 3 times!
Given the list [50000, 50000, 50000], approaches 1, 2 and 3 generate the following numbers of iterations: 15, 225000, and 6.
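A quick way to see this (a tiny check of my own, not part of the answer) is to print the reachable-sums set as approach 3 processes [50000, 50000, 50000]:

nums = [50000, 50000, 50000]
cur = {0}
for number in nums:
    cur |= {number + x for x in cur}
    print(sorted(cur))
# [0, 50000]
# [0, 50000, 100000]
# [0, 50000, 100000, 150000]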
You are right: approaches 1) and 3) have the same time complexity. Approach 2 is the DP version of 0/1 knapsack; approach 1 is the branch-and-bound version. You can improve approach 1 by pruning the tree with any of the knapsack heuristics, but the optimization has to be strict, e.g. if the existing sum plus the sum of the remaining elements at level K is < half, then skip it. This way approach 1) can have better computational complexity than 3).
Why do approaches 1) and 3) have different running times?
[To some extent]
That has more to do with the implementation of dictionaries in Python. Dictionaries are implemented natively by the Python interpreter, so any operation on them is simply going to be faster than anything that needs to be interpreted first, and more often. Also, function calls have higher overhead in Python: they are objects, so calling one is not a simple bump-the-stack-and-jmp/call operation.
[To a large extent]
Another aspect to mull over is the time complexity of the third approach. For approach 3, the only way the time complexity can be exponential is if each iteration inserts as many new elements as there already are in the set at that iteration.
cur |= { number + x for x in cur}
In the worst case, the above line doubles |cur|.
I think that it is possible for a series like
s = {k, k^2, k^3, ..., k^n, (> k^(n+1))}
(where k is a prime > 2)
to give a worst-case time on the order of 2^n for approach 3. Not sure yet what the average expected time complexity is.
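For contrast (again my own check, using powers of a prime as suggested above), with distinct powers every subset has a distinct sum, so the set doubles at every step:

cur = {0}
for number in [3, 9, 27, 81]:
    cur |= {number + x for x in cur}
    print(len(cur))
# 2, 4, 8, 16: every subset sums to a different value, so |cur| doubles each iteration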
I have an interesting problem involving two sorted arrays:
a with n elements, b with n-1 elements.
b has all the elements of a, but one element is missing.
How to find that element in O(log n) time?
I have tried this code:
def lostElements2(a, b):
    if len(a) < len(b):
        a, b = b, a
    l, r = 0, len(a) - 1
    while l < r:
        m = l + (r - l) // 2
        if a[m] == b[m]:
            l = m + 1
        else:
            r = m - 1
    return a[r]

print(lostElements2([-1, 0, 4, 5, 7, 9], [-1, 0, 4, 5, 9]))
I am not getting what I should return from the function - should it be a[l] or a[r]?
I do get how the logic inside the function should work: if the mid values of both arrays match, then b up to the mid point is the same as a, and hence the missing element must be to the right of mid.
But I am not able to put together a final solution: when should the loop stop, and what should be returned? How is it guaranteed that a[l] or a[r] is indeed the missing element?
The point of l and r should be that l is always a position where the lists are equal, while r is always a position where they differ, i.e.
a[l] == b[l] and a[r] != b[r]
The only mistake in the code is updating r to m - 1 instead of m. If we know that a[m] != b[m], we can safely set r = m. But setting it to m - 1 risks getting a[r] == b[r], which breaks the algorithm.
def lostElements2(a, b):
    if len(a) < len(b):
        a, b = b, a
    if a[0] != b[0]:
        return a[0]
    l, r = 0, len(a) - 1
    while l < r:
        m = l + (r - l) // 2
        if a[m] == b[m]:
            l = m + 1
        else:
            r = m  # The only change
    return a[r]
(As @btilly points out, this algorithm fails if we allow for repeated values.)
Edit from @btilly:
To fix that potential flaw, if the values are equal, we search for the range with the same value. To do that we walk forward in steps of size 1, 2, 4, 8 and so on until the value switches, then do a binary search; and we walk backwards the same way. Now look for a difference at each edge of that range.
The effort required for that search is O(log(k)), where k is the length of the run of repeated values. So we are now replacing O(log(n)) lookups with O(log(n)) searches. If there is an upper bound K on the length of each such search, the overall running time is O(log(n)log(K)). That makes the worst-case running time O(log(n)^2). If K is close to sqrt(n), it is easy to actually hit that worst case.
I claimed in a comment that if at most K elements are repeated more than K times then the running time is O(log(n)log(K)). On further analysis, that claim is wrong. If K = log(n) and the log(n) runs of length sqrt(n) are placed to hit all the choices of the search, then you get running time O(log(n)^2) and not O(log(n)log(log(n))).
However if at most log(K) elements are repeated more than K times, then you DO get a running time of O(log(n)log(K)). Which should be good enough for most cases. :-)
The principle of this problem is simple, the details are hard.
You have arranged that array a is the longer one. Good, that simplifies life. Now you need to return the value of a at the first position where the value of a differs from the value of b.
Now you need to be sure to deal with the following edge cases.
The differing value is the last one (i.e. at a position where only array a has a value).
The differing value is the very first one. (Binary search algorithms are easy to screw up for this case.)
There is a run of identical values. That is, a = [1, 1, 2, 2, 2, 2, 3] while b = [1, 2, 2, 2, 2, 3] - when you land in the middle, the fact that the values match can mislead you!
Good luck!
Your code is not handling the case where the missing element is at index m itself. The if/else clause that follows always moves the bounds of where the missing element can be so that they no longer include m.
You could fix this by including an additional check:
if a[m] == b[m]:
    l = m + 1
elif m == 0 or a[m - 1] == b[m - 1]:
    return a[m]
else:
    r = m - 1
An alternative would be to store the last value of m:
last_m = 0
...
else:
    last_m = m
    r = m - 1
...
return a[last_m]
This would cause it to return the value at the last position where a mismatch was detected.