Knapsack questions - python

For LeetCode 322 Coin Change, the code is below (note that this code actually counts the number of coin combinations, as in LeetCode 518 Coin Change II, rather than minimizing the coin count):
class Solution:
    def change(self, amount: int, coins: List[int]) -> int:
        dp = [0] * (amount + 1)
        dp[0] = 1  # one way to make amount 0: take no coins
        for i in range(len(coins)):
            for j in range(1, amount + 1):
                if coins[i] <= j:
                    # ascending j: dp[j - coins[i]] may already include
                    # coins[i], so the coin can be reused
                    dp[j] = dp[j] + dp[j - coins[i]]
        return dp[-1]
For LeetCode 1049 Last Stone Weight II, the code is below:
class Solution:
    def lastStoneWeightII(self, stones: List[int]) -> int:
        total = sum(stones)
        Max_weight = total // 2  # fill one pile as close to half the total as possible
        current = [0] * (Max_weight + 1)
        for stone in stones:
            # descending order: each stone can be used at most once
            for wgt in range(Max_weight, -1, -1):
                if wgt - stone >= 0:
                    current[wgt] = max(stone + current[wgt - stone], current[wgt])
        return total - 2 * current[-1]
I'd like to understand the logic of knapsack problems better. My biggest question mark is how we decide whether the second for loop should run in ascending or descending order. For example, in the first coin change code the loop [for j in range(1, amount + 1):] is ascending, but in the stone weight code the loop [for wgt in range(Max_weight, -1, -1):] is descending. I kind of understand why it works after tracing the code, but is there a more intuitive way of explaining when to use an ascending or a descending loop in knapsack problems? I'm really not able to grasp when to use which one. Thanks!

They are fundamentally two different problems.
The 322 Coin Change problem is an instance of the Unbounded Knapsack problem -- you have infinite quantities of the items that you want to fit in your knapsack.
The 1049 LastStoneWeightII problem is an instance of the Bounded Knapsack problem -- you have only one instance of each item that you want to fit in your knapsack.
The reason you need to run the for loop in descending order in the second problem is that otherwise you will accidentally add a stone more than once. If you run the loop in ascending order, the optimal solution of a subproblem might already include the current stone. If you run the loop in reverse, you can be sure that the subproblem solution you are building on, when considering the current stone, does not already contain that stone.
If you want to use the incrementing for loop for the bounded knapsack problem, you can either use the 2D array version of the algorithm OR you need to keep track of the subset that is part of the solution to each subproblem.
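To make the contrast concrete, here is a minimal sketch (the function names are mine, not from the original post): the only difference between the unbounded and the 0/1 variant of the 1D table is the direction of the inner loop.

def unbounded_count(amount, coins):
    # Ascending inner loop: dp[j - c] may already include coin c,
    # so every coin can be reused -- unbounded knapsack.
    dp = [0] * (amount + 1)
    dp[0] = 1
    for c in coins:
        for j in range(c, amount + 1):
            dp[j] += dp[j - c]
    return dp[amount]

def zero_one_count(amount, coins):
    # Descending inner loop: dp[j - c] still holds the value from
    # before this coin was considered, so each coin is used at most
    # once -- 0/1 (bounded) knapsack.
    dp = [0] * (amount + 1)
    dp[0] = 1
    for c in coins:
        for j in range(amount, c - 1, -1):
            dp[j] += dp[j - c]
    return dp[amount]

print(unbounded_count(4, [1, 2]))  # 3: 1+1+1+1, 1+1+2, 2+2
print(zero_one_count(4, [1, 2]))   # 0: no subset of {1, 2} sums to 4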
You can find more discussion here: https://leetcode.com/discuss/study-guide/1200320/Thief-with-a-knapsack-a-series-of-crimes.

Minimum Coin Change Leetcode problem (Dynamic Programming)

Here's the question: https://leetcode.com/problems/coin-change/
I'm having some trouble understanding two different dynamic programming methods used to solve this problem. I'm currently going through the Grokking Dynamic Programming course from educative.io, and their approach is to use subsets to search for each combination: test whether a coin is viable, and if so, try it in the DFS; if not, skip the coin, go to the next index, and try the next coin.
Here's Grokking's approach with memoization:
from math import inf

class Solution:
    def coinChange(self, coins: List[int], amount: int) -> int:
        def dfs(i, total, memo):
            key = (i, total)
            if key in memo:
                return memo[key]
            if total == 0:
                return 0
            if i >= len(coins):
                return inf
            count = inf
            if coins[i] <= total:
                # Take coins[i]; stay at index i so it can be reused.
                res = dfs(i, total - coins[i], memo)
                if res != inf:
                    count = res + 1
            # Or skip coins[i] and move to the next coin.
            memo[key] = min(count, dfs(i + 1, total, memo))
            return memo[key]

        res = dfs(0, amount, {})  # compute once instead of twice
        return res if res != inf else -1
It doesn't do very well on Leetcode; it runs very slowly (but passes, nonetheless). The efficient algorithm that was in the discussions was this:
from functools import lru_cache

class Solution:
    def coinChange(self, coins: List[int], amount: int) -> int:
        @lru_cache(None)
        def dp(sum):
            if sum == 0: return 0
            if sum < 0: return float("inf")
            count = float('inf')
            for coin in coins:
                count = min(count, dp(sum - coin))
            return count + 1

        res = dp(amount)
        return res if res != float("inf") else -1
Does this second code have the same logic as "testing the subsets of coins?" What's the difference between the two? Is the for-loop a way of testing the different subsets, like with backtracking?
I tested the second algorithm with memoization in a dictionary, like the first, using sum as the key, and it tanked in efficiency. But then I tried using @lru_cache with the first algorithm, and it didn't help.
Could anyone explain why the second algorithm is so much faster? Is it my memoization that sucks?
Does this second code have the same logic as "testing the subsets of coins?"
If with subset you mean the subset of the coins that is still available for selection, then: no. The second algorithm does not reduce the problem in terms of coins; it reasons that at any time any coin can be selected, irrespective of previous selections. Although this may seem inefficient as it tries to take the same combinations in all possible permutations, this downside is minimised by the effect of memoization.
What's the difference between the two?
The first one takes coins in the order they are given, never going back to take an earlier coin once it has decided to go to the next one. So doing, it tries to reduce the problem in terms of available coins. The second one doesn't care about the order and looks at any permutation, it only reduces the problem in terms of amount.
This first one has a larger memoization collection because the index is part of the key, whereas the second uses a memoization collection that is only keyed by the amount.
The first one makes a recursive call even when no coin is selected (the one at the end of the inner function), since that fits in the logic of reducing the problem to fewer coins. The second one only makes a recursive call when the amount is further reduced.
Is the for-loop a way of testing the different subsets, like with backtracking?
If with subset you mean that the problem is reduced to fewer coins, then no: the second algorithm doesn't attempt to apply that methodology.
The for loop is just a way to consider every coin. It doesn't reduce the problem size in terms of available coins, only in terms of remaining amount.
Could anyone explain why the second algorithm is so much faster?
It is faster because the memoization key is smaller, leading to more hits, leading to fewer recursive calls. You can experiment with this and add global counters that count the number of executions of both inner functions (dfs and dp) and you'll see a dramatic difference there.
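Here is a sketch of that experiment (the standalone function names and the test case are mine; the bodies mirror the two versions above):

from functools import lru_cache
from math import inf

calls = {"dfs": 0, "dp": 0}  # global counters, as suggested above

def coin_change_dfs(coins, amount):
    # Mirrors the first version, keyed by (index, total).
    def dfs(i, total, memo):
        calls["dfs"] += 1
        key = (i, total)
        if key in memo:
            return memo[key]
        if total == 0:
            return 0
        if i >= len(coins):
            return inf
        count = inf
        if coins[i] <= total:
            res = dfs(i, total - coins[i], memo)
            if res != inf:
                count = res + 1
        memo[key] = min(count, dfs(i + 1, total, memo))
        return memo[key]
    return dfs(0, amount, {})

def coin_change_dp(coins, amount):
    # Mirrors the second version, keyed only by the remaining amount;
    # lru_cache lets the body (and the counter) run once per distinct amount.
    @lru_cache(None)
    def dp(remaining):
        calls["dp"] += 1
        if remaining == 0:
            return 0
        if remaining < 0:
            return inf
        return min(dp(remaining - coin) for coin in coins) + 1
    return dp(amount)

coin_change_dfs([1, 2, 5], 500)
coin_change_dp([1, 2, 5], 500)
print(calls)  # the "dfs" count is several times the "dp" count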
Is it my memoization that sucks?
You could say that, but it is too harsh.

I'm not able to memoize this LeetCode question. What am I doing wrong?

Question link: https://leetcode.com/problems/unique-paths/
Code with memoization, but it takes the same amount of time: https://leetcode.com/submissions/detail/672801459/
Code without memoization: https://leetcode.com/submissions/detail/672800593/
I've written code for memoization, but something doesn't work. Please tell me what I am doing wrong.
Memoization is not the problem
I just think that even if the implementation works the way you intend, it is still too slow: you could have up to a billion paths to explore, each of which could take more than 100 steps.
Try it with combinatorics
Another approach is to look at the problem from a combinatorics side:
notice how each path can be represented by a list of down and right moves
so the total number of paths is just the number of ways to arrange those down and right moves
From the original question, we can see that there are m-1 moves down and n-1 moves right.
In combinatorics, the result is the number of ways to choose m-1 moves out of the (m-1) + (n-1) moves in total.
Here is some code that would do that:
class Solution:
    def uniquePaths(self, m: int, n: int) -> int:
        def factorial(a):
            out = 1
            for i in range(1, a + 1):
                out *= i
            return out

        def combination(a, b):
            # C(b, a) = b * (b-1) * ... * (b-a+1) / a!
            out = 1
            for i in range(b - a + 1, b + 1):
                out *= i
            return out // factorial(a)

        return combination(m - 1, (m - 1) + (n - 1))
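As a side note (my addition, not part of the original answer): on Python 3.8+ the standard library computes the same binomial coefficient directly via math.comb.

from math import comb

class Solution:
    def uniquePaths(self, m: int, n: int) -> int:
        # Choose which m-1 of the (m-1) + (n-1) moves go down; the rest go right.
        return comb((m - 1) + (n - 1), m - 1)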

Time complexity for a greedy recursive algorithm

I have coded a greedy recursive algorithm to find the minimum number of coins that make a given amount of change. Now I need to estimate its time complexity. As the algorithm has nested "ifs" depending on the same i (n * n), with the inner block halving the recursive call (log2(n)), I believe the correct answer could be O(n*log(n)), resulting from the following calculation:
n * log2(n) * O(1)
Please, give me your thoughts on whether my analysis is correct and feel free to also suggest improvements on my greedy recursive algorithm.
This is my recursive algorithm:
coins = [1, 5, 10, 21, 25]
coinsArraySize = len(coins)
change = 63
pickedCoins = []

def findMin(change, i, pickedCoins):
    if i >= 0:
        if change >= coins[i]:
            # Take the current coin and stay on it.
            pickedCoins.append(coins[i])
            findMin(change - coins[i], i, pickedCoins)
        else:
            # Current coin is too large; move to the next smaller one.
            findMin(change, i - 1, pickedCoins)

findMin(change, coinsArraySize - 1, pickedCoins)
Each recursive call either decreases change by at least 1 or decreases i by 1, and there is no branching (that is, your recursion tree is actually a straight line, so no recursion is actually necessary). Your running time is O(n).
What is n? The runtime depends on both the amount and the specific coins. For example, suppose you have a million coins, 1 through 1,000,000, and try to make change for 1. The code will go a million recursive levels deep before it finally finds the largest coin it can use (1). Same thing in the end if you have only one coin (1) and try to make change for 1,000,000 - then you find the coin at once, but go a million levels deep picking that coin a million times.
Here's a non-recursive version that improves on both of those: use binary search to find the next usable coin, and once a suitable coin is found use it as often as possible.
def makechange(amount, coins):
    from bisect import bisect_right
    # assumes `coins` is sorted, and that coins[0] > 0
    right_bound = len(coins)
    result = []
    while amount > 0:
        # Find the largest coin <= amount.
        i = bisect_right(coins, amount, 0, right_bound)
        if not i:
            raise ValueError("don't have a coin <=", amount)
        coin = coins[i - 1]
        # How many of those can we use?
        n, amount = divmod(amount, coin)
        assert n >= 1
        result.extend([coin] * n)
        right_bound = i - 1
    return result
It still takes O(amount) time if asked to make change for a million with the only coin being 1, but only because it has to build a result list with a million copies of 1. If there are a million coins and you ask for change for 1, though, it's O(log2(len(coins))) time. The first case could be slashed by changing the output format to a dict, mapping each coin to the number of times that coin is used. Then the first case would be cut to O(1) time.
As is, the time it takes is proportional to the length of the result list, plus some (usually trivial) time for a number of binary searches equal to the number of distinct coins used. So "a bad case" is one where every coin needs to be used; e.g.,
>>> coins = [2**i for i in range(10)]
>>> makechange(sum(coins), coins)
[512, 256, 128, 64, 32, 16, 8, 4, 2, 1]
That's essentially O(n + n log n) where n is len(coins).
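Here is a sketch of the dict-returning variant suggested above (the function name is mine):

def makechange_counts(amount, coins):
    # Same algorithm, but the result maps coin -> count, so using a
    # coin n times is O(1) bookkeeping instead of n list appends.
    from bisect import bisect_right
    right_bound = len(coins)  # assumes `coins` is sorted and coins[0] > 0
    result = {}
    while amount > 0:
        i = bisect_right(coins, amount, 0, right_bound)
        if not i:
            raise ValueError("don't have a coin <=", amount)
        coin = coins[i - 1]
        n, amount = divmod(amount, coin)
        result[coin] = n
        right_bound = i - 1
    return result

print(makechange_counts(1_000_000, [1]))  # {1: 1000000} without a million-element list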
Finding an optimal solution
As @Stef noted in a comment, the greedy algorithm doesn't always find a minimal number of coins. That's substantially harder. The usual approach is via dynamic programming, with worst case O(amt * len(coins)) time. But that's also the best case: it works "bottom up", finding the smallest number of coins to reach 1, then 2, then 3, then 4, ..., and finally amt.
So I'm going to suggest a different approach, using breadth-first tree search, working down from the initial amount until reaching 0. Worst-case O() behavior is the same, but the best-case time is far better. For the comment's example:
mincoins(10000, [1, 2000, 3000])
it looks at fewer than 20 nodes before finding the optimal 4-coin solution. Because it's a breadth-first search, it knows there's no possible shorter path to the root, so it can stop right away.
For a worst-case example, try
mincoins(1000001, range(2, 200, 2))
All the coins are even numbers, so it's impossible for any collection of them to sum to the odd target. The tree has to be expanded half a million levels deep before it realizes 0 is unreachable. But while the branching factor at high levels is O(len(coins)), the total number of nodes in the entire expanded tree is bounded above by amt + 1 (pigeonhole principle: the dict can't have more than amt + 1 keys, so any number of nodes beyond that are necessarily duplicate targets and so are discarded as soon as they're generated). So, in reality, the tree in this case grows wide very quickly, but then quickly becomes very narrow and very deep.
Also note that this approach makes it very easy to reconstruct the minimal collection of coins that sum to amt.
def mincoins(amt, coins):
    from collections import deque
    coins = sorted(set(coins))  # increasing, no duplicates
    # Map amount to the coin that was subtracted to reach it.
    a2c = {amt: None}
    d = deque([amt])
    while d:
        x = d.popleft()
        for c in coins:
            y = x - c
            if y < 0:
                break  # all remaining coins too large
            if y in a2c:
                continue  # already found cheapest way to get y
            a2c[y] = c
            d.append(y)
            if not y:  # done!
                d.clear()
                break
    if 0 not in a2c:
        raise ValueError("not possible", amt, coins)
    picks = []
    a = 0
    while True:
        c = a2c[a]
        if c is None:
            break
        picks.append(c)
        a += c
    assert a == amt
    return sorted(picks)
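Trying it on the data from the question shows the difference from the greedy approach: greedy picks 25 + 25 + 10 + 1 + 1 + 1 (six coins), while the breadth-first search finds the three-coin solution.

print(mincoins(63, [1, 5, 10, 21, 25]))  # [21, 21, 21]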

How to solve subset sum problem with array of size sum+1

Since the problem isn't new and there are a lot of algorithms that solve it, I assumed this question might be a duplicate, but I didn't find one.
There is a set of elements. The task is to find whether there is a subset with a sum equal to some variable s.
The primitive solution is straightforward and runs in exponential time. The recursive DP approach proposes adding memoization to reduce complexity, or working with a 2D array (bottom-up).
I found another one in a comment on GeeksforGeeks but can't understand how it works.
def is_subset_sum(a, s):
    n = len(a)
    res = [False] * (s + 1)
    res[0] = True  # the empty subset sums to 0
    for j in range(n):
        # Descending, so each element is counted at most once.
        i = s
        while i >= a[j]:
            res[i] = res[i] or res[i - a[j]]
            i -= 1
    return res[s]
Could someone please explain the algorithm? What do the elements of the array actually mean? I'm trying to trace it but can't manage it.
Putting words to the code: res[i] records whether the sum i is reachable using the elements considered so far. Trying each element in the list in turn, set a temporary variable, i, to the target sum. While i is not smaller than the current element, a[j], the sum equal to the current value of i is reachable if either (1) it was already reachable and marked so, or (2) it can be reached by adding the current element, a[j], to the sum i - a[j], which we may have already marked. We thus enumerate all the possibilities in O(s * n) time and O(s) space. (i might be a poor choice for that variable name, since it's probably most commonly seen representing an index rather than a sum. Although, in this case, the sums we are checking are themselves also indexes.)
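A short trace may help (the example values are mine):

# a = [3, 4], s = 7
# start:          res = [True] + [False] * 7
# after a[0] = 3: res[3] becomes True (via res[0])
# after a[1] = 4: res[7] becomes True (via res[3]) and res[4] becomes True (via res[0])
print(is_subset_sum([3, 4], 7))  # True: the subset {3, 4} sums to 7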

Find maximum sum of a sublist of specified length in a list of positive integers, under O(n^2), Python 3.5

For one of my programming questions, I am required to define a function that accepts two variables: a list of length l and an integer w. I then have to find the maximum sum of a sublist of length w within the list.
Conditions:
1 <= w <= l <= 100000
Each element in the list is in the range [1, 100]
Currently, my solution works in O(n^2) (correct me if I'm wrong; code attached below), which the autograder does not accept, since we are required to find a more efficient solution.
My code:
def find_best_location(w, lst):
    best = 0
    n = 0
    while n <= len(lst) - w:
        lists = lst[n: n + w]  # each slice and sum costs O(w)
        cur = sum(lists)
        best = cur if cur > best else best
        n += 1
    return best
If anyone is able to find a more efficient solution, please do let me know! Also, if I computed my big-O notation wrongly, do let me know as well!
Thanks in advance!
1) Find the sum current of the first w elements and assign it to best.
2) Starting from i = w: current = current + lst[i] - lst[i-w], best = max(best, current).
3) Done.
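A minimal sketch of that sliding-window idea (the function name and test values are mine):

def max_window_sum(w, lst):
    # Sum of the first window of length w.
    current = sum(lst[:w])
    best = current
    # Slide the window one step at a time: add the element that
    # enters on the right, drop the one that leaves on the left.
    for i in range(w, len(lst)):
        current += lst[i] - lst[i - w]
        best = max(best, current)
    return best

print(max_window_sum(2, [1, 2, 3, 10, 4, 1]))  # 14, from the window [10, 4]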
Your solution is indeed O(n^2) (or O(n*w) if you want a tighter bound).
You can do it in O(n) by creating an aux array sums, where:
sums[0] = l[0]
sums[i] = sums[i-1] + l[i]
Then, by iterating it and checking sums[i] - sums[i-w], you can find your solution in linear time.
You can even calculate the sums array on the fly to reduce space complexity, but if I were you, I'd start with the array and see if I can upgrade my solution next.
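A sketch of that prefix-sum version (my naming; I pad with a leading zero so the first window needs no special case, a slight variation on the indexing above):

def max_window_sum_prefix(w, lst):
    # sums[i] = sum of the first i elements; the leading 0 means the
    # window ending at position i has sum sums[i] - sums[i - w].
    sums = [0] * (len(lst) + 1)
    for i, x in enumerate(lst):
        sums[i + 1] = sums[i] + x
    return max(sums[i] - sums[i - w] for i in range(w, len(lst) + 1))

print(max_window_sum_prefix(2, [1, 2, 3, 10, 4, 1]))  # 14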
