simplify code in depth first search function call - python

I am one of the many people who hang around Stack Overflow for knowledge and help, especially those of us who are already out of school. Much of my CS knowledge was learnt from this excellent site. Sometimes my questions can be quite silly, so please forgive me as a newbie.
I am working on the Largest Divisible Subset problem on LeetCode. There are many good solutions there, but I want to solve it with my own approach first. My strategy is to turn this problem into a combination problem and find the largest combination that meets the divisibility requirement. I use depth-first search and isDivisible to build such combinations. All the combinations I find meet the divisibility requirement.
Here is how I would generate all possible combinations of a given sequence.
def combinations(nums, path, res):
    if not nums:
        res.append(path)
    for i in range(len(nums)):
        combinations(nums[i+1:], path+[nums[i]], res)
The following is my code to build all possible divisible subsets. It is almost exactly the same as the code above, except that I use isDivisible to decide whether or not to add nums[i] to the path.
def isDivisible(num, list_):
    return all([num%item==0 or item%num==0 for item in list_])

def dfs(nums, path, res):
    if not nums:
        res.append(path)
        return
    for i in range(len(nums)):
        # if not path or isDivisible(nums[i], path):
        #     path = path + [nums[i]]
        # dfs(nums[i+1:], path , res)
        dfs(nums[i+1:], path + ([nums[i]] if not path or isDivisible(nums[i], path) else []), res)

path = []
res = []
dfs(nums, [], res)
return sorted(res, key=len)
It works (it was almost accepted, but it exceeded the time limit for large inputs because of the performance of dfs). My question is how I can simplify the last line of dfs by moving ([nums[i]] if not path or isDivisible(nums[i], path) else []) out of the function call, where it is too bulky. I tried replacing the last line with the three commented-out lines, but that failed because the reassigned path carries every nums[i] that meets the condition over to the following dfs calls in the loop. Could you please show me how to simplify the code and give some general suggestions? Thank you very much.
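One way to take the conditional out of the call without hitting the propagation problem is to store the extended path in a separate local variable instead of reassigning path. The following is a minimal sketch of that idea, not a method taken from the answers; the names is_divisible, next_path and largest_divisible_subset are made up for this illustration.

```python
def is_divisible(num, chosen):
    """True if num divides, or is divided by, every item already chosen."""
    return all(num % item == 0 or item % num == 0 for item in chosen)


def dfs(nums, path, res):
    if not nums:
        res.append(path)
        return
    for i in range(len(nums)):
        # Build the path for the recursive call in a fresh local (hypothetical
        # name next_path), so that path itself is unchanged for the next i.
        if not path or is_divisible(nums[i], path):
            next_path = path + [nums[i]]
        else:
            next_path = path
        dfs(nums[i + 1:], next_path, res)


def largest_divisible_subset(nums):
    res = []
    dfs(nums, [], res)
    return max(res, key=len)


print(largest_divisible_subset([1, 2, 4, 8]))  # [1, 2, 4, 8]
```

Because path is never reassigned, every loop iteration starts from the same prefix. This only tidies the call; the search is still exponential, so it should not be expected to fix the time limit issue.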

I'm not sure about your method; check first whether it would get accepted.
Here is a solution that is a bit simpler to implement (which I guess is one of Stefan's suggested methods):
class Solution:
    def largestDivisibleSubset(self, nums):
        hashset = {-1: set()}
        for num in sorted(nums):
            hashset[num] = max((hashset[k] for k in hashset if num % k == 0), key=len) | {num}
        return list(max(hashset.values(), key=len))
Here is LeetCode's DP solution with comments:
class Solution(object):
    def largestDivisibleSubset(self, nums):
        """
        :type nums: List[int]
        :rtype: List[int]
        """
        if len(nums) == 0:
            return []
        # important step !
        nums.sort()
        # The container that keep the size of the largest divisible subset that ends with X_i
        # dp[i] corresponds to len(EDS(X_i))
        dp = [0] * (len(nums))
        """ Build the dynamic programming matrix/vector """
        for i, num in enumerate(nums):
            maxSubsetSize = 0
            for k in range(0, i):
                if nums[i] % nums[k] == 0:
                    maxSubsetSize = max(maxSubsetSize, dp[k])
            maxSubsetSize += 1
            dp[i] = maxSubsetSize
        """ Find both the size of largest divisible set and its index """
        maxSize, maxSizeIndex = max([(v, i) for i, v in enumerate(dp)])
        ret = []
        """ Reconstruct the largest divisible subset """
        # currSize: the size of the current subset
        # currTail: the last element in the current subset
        currSize, currTail = maxSize, nums[maxSizeIndex]
        for i in range(maxSizeIndex, -1, -1):
            if currSize == dp[i] and currTail % nums[i] == 0:
                ret.append(nums[i])
                currSize -= 1
                currTail = nums[i]
        return reversed(ret)
I guess this LeetCode solution, which is a recursion with memoization, would be a bit closer to your method:
class Solution:
    def largestDivisibleSubset(self, nums: List[int]) -> List[int]:
        def EDS(i):
            """ recursion with memoization """
            if i in memo:
                return memo[i]
            tail = nums[i]
            maxSubset = []
            # The value of EDS(i) depends on it previous elements
            for p in range(0, i):
                if tail % nums[p] == 0:
                    subset = EDS(p)
                    if len(maxSubset) < len(subset):
                        maxSubset = subset
            # extend the found max subset with the current tail.
            maxSubset = maxSubset.copy()
            maxSubset.append(tail)
            # memorize the intermediate solutions for reuse.
            memo[i] = maxSubset
            return maxSubset
        # test case with empty set
        if len(nums) == 0: return []
        nums.sort()
        memo = {}
        # Find the largest divisible subset
        return max([EDS(i) for i in range(len(nums))], key=len)
References
For additional details, you can see the Discussion Board. There are plenty of accepted solutions there in a variety of languages, with explanations, efficient algorithms, and asymptotic time/space complexity analysis.

Related

Given an array nums of distinct integers, return all the possible permutations. You can return the answer in any order?

I am working on LeetCode problem 46. Permutations:
Given an array nums of distinct integers, return all the possible permutations. You can return the answer in any order.
I thought of solving this using backtracking. My idea is to imagine this problem as a binary tree and step down a path. When I get to a leaf, I pop from the visit array and restore to a new root number.
My code below:
class Solution:
def permute(self, nums: List[int]) -> List[List[int]]:
perms = []
def dfs(curr, l):
if len(nums) == len(curr):
perms.append([int(i) for i in curr])
return
visit = []
for t in nums:
if str(t) not in curr:
visit.append(t)
dfs(curr + str(l), t)
visit.pop()
return
dfs('', nums[0])
return perms
I get wrong output for the following test case:
nums = [1,2,3]
The expected output is:
[[1,2,3],[1,3,2],[2,1,3],[2,3,1],[3,1,2],[3,2,1]]
But my code outputs:
[[1,1,2],[1,1,2],[1,1,3],[1,1,3],[1,2,2],[1,2,3],[1,3,2],[1,3,3]]
I don't understand why my output has lists with duplicate ones, even though I check that str(t) not in curr to avoid that duplicate use of a number.
What am I missing?
Here's the backtracking version:
def f(lst, curr = []):
    if len(lst) == len(curr):
        return [tuple(curr)]
    result = []
    for e in lst:
        if e not in curr:
            curr.append(e)
            result.extend(f(lst, curr))
            curr.pop()
    return result

lst = [1, 2, 3, 4]
print(f(lst))
The main reason you have duplicate 1 in your output tuples (in the example) is that in the recursive call you are not appending the right number to curr. l is already in curr once you are in a recursive call! It should be dfs(curr + str(t), t) since you have verified that t is not in curr, it should be that number that is appended to it.
And when you make this change, there is no more need for the l parameter in dfs, as l is no longer used.
There are a few other issues however:
return perms has a wrong indentation (probably a typo in your question?)
The code assumes that numbers are always single characters when converted to string, but the code challenge indicates that a number may be 10 or may be negative, and so the way you check that a number is already in curr will not be reliable. For instance, if you first visit "10" and then want to visit "1", it will not work because if str(t) not in curr: will not be true.
Secondly, [int(i) for i in curr] will extract only one-digit numbers, and will fail if you had appended a negative number in curr, as then int('-') will raise an exception.
Not a problem, but the visit list is useless in your code. It is never used to verify anything. Also, the return as the last statement in dfs is not really needed, as this is the default behaviour.
I would suggest making curr a list of numbers instead of a string.
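The string-related issues listed above can be reproduced in isolation. This is a small sketch with made-up helper names (already_used_str, already_used_list) that contrasts the substring test on a string with the element test on a list:

```python
def already_used_str(t, curr):
    """Membership test on a string, as in the question: a substring match."""
    return str(t) in curr


def already_used_list(t, curr):
    """Membership test on a list of ints: a whole-element match."""
    return t in curr


print(already_used_str(1, "10"))    # True, although 1 itself was never visited
print(already_used_list(1, [10]))   # False
print([int(ch) for ch in "10"])     # [1, 0], not [10]
```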
Here is your code with the above changes applied:
class Solution:
    def permute(self, nums: List[int]) -> List[List[int]]:
        perms = []
        def dfs(curr):
            if len(nums) == len(curr):
                perms.append(curr)
                return
            for t in nums:
                if t not in curr:
                    dfs(curr + [t])
        dfs([])
        return perms
It would be nice to use a generator here:
class Solution:
    def permute(self, nums: List[int]) -> List[List[int]]:
        def dfs(curr):
            if len(nums) == len(curr):
                yield curr
                return
            for t in nums:
                if t not in curr:
                    yield from dfs(curr + [t])
        return list(dfs([]))
Finally, there is of course ... itertools:
import itertools

class Solution:
    def permute(self, nums: List[int]) -> List[List[int]]:
        return list(itertools.permutations(nums))
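One detail worth knowing about the itertools version is that itertools.permutations yields tuples, while the declared return type is a list of lists. If real lists are wanted, a conversion along these lines should work (a sketch; the standalone function name permute is chosen only for this example):

```python
import itertools


def permute(nums):
    # Convert each tuple produced by itertools.permutations into a list.
    return [list(p) for p in itertools.permutations(nums)]


print(permute([1, 2, 3]))
# [[1, 2, 3], [1, 3, 2], [2, 1, 3], [2, 3, 1], [3, 1, 2], [3, 2, 1]]
```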

Memoized solution to Combination IV on Leetcode gives TLE when an array is used for caching

While trying to solve Combination IV on Leetcode, I came up with this memoized solution:
def recurse(nums, target, dp):
    if dp[target]!=0:
        return dp[target]
    if target==0:
        return dp[0]
    for n in nums:
        if n<=target:
            dp[target] += recurse(nums, target-n, dp)
    return dp[target]

class Solution:
    def combinationSum4(self, nums: List[int], target: int) -> int:
        dp = [0]*(target+1)
        dp[0] = 1
        return recurse(nums, target, dp)
But this gives me a Time Limit Exceeded error.
Another memoized solution, that uses a dictionary to cache values instead of a dp array, runs fine and does not exceed time limit. The solution is as follows:
class Solution:
    def combinationSum4(self, nums: List[int], target: int) -> int:
        memo = {}
        def dfs(nums, t, memo):
            if t in memo:
                return memo[t]
            if t == 0:
                return 1
            if t < 0:
                return 0
            res = 0
            for i in nums:
                res += dfs(nums, t-i, memo)
            memo[t] = res
            return res
        return (dfs(nums, target, memo))
Why does using a dict instead of an array improve runtime? It is not like we are iterating through the array or dict, we are only using them to store and access values.
EDIT: The test case on which my code crashed is as follows:
nums = [10,20,30,40,50,60,70,80,90,100,110,120,130,140,150,160,170,180,190,200,210,220,230,240,250,260,270,280,290,300,310,320,330,340,350,360,370,380,390,400,410,420,430,440,450,460,470,480,490,500,510,520,530,540,550,560,570,580,590,600,610,620,630,640,650,660,670,680,690,700,710,720,730,740,750,760,770,780,790,800,810,820,830,840,850,860,870,880,890,900,910,920,930,940,950,960,970,980,990,111]
target = 999
The two versions of the code are not the same. In the list version, you keep recursing if your "cached" value is 0. In the dict version, you keep recursing if the current key is not in the cache. This makes a difference when the result is 0. For example, if you try an example with nums=[2, 4, 6, 8, 10] and total=1001, there is no useful caching done in the list version (because every result is 0).
You can improve your list version by initializing every entry to None rather than 0, and using None as a sentinel value to determine if the result isn't cached.
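A minimal sketch of that None-sentinel idea follows. The function name count_ways and the nested helper are illustrative choices, not code from the question; the point is only that a cached 0 is now distinguishable from "not computed yet".

```python
def count_ways(nums, target):
    # None means "not computed yet"; 0 is a legitimate cached result.
    dp = [None] * (target + 1)
    dp[0] = 1

    def recurse(t):
        if dp[t] is not None:
            return dp[t]
        total = 0
        for n in nums:
            if n <= t:
                total += recurse(t - n)
        dp[t] = total
        return total

    return recurse(target)


print(count_ways([1, 2, 3], 4))   # 7
print(count_ways([2, 4, 6], 51))  # 0
```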
It's also easier to drop the idea of a cache, and use a dynamic programming table directly. For example:
def ways(total, nums):
    r = [1] + [0] * total
    for i in range(1, total+1):
        r[i] = sum(r[i-n] for n in nums if i-n>=0)
    return r[total]
This obviously runs in O(total * len(nums)) time.
It's not necessary here (since total is at most 1000 in the question), but you can in principle keep only the last N rows of the table as you iterate (where N is the max of the nums). That reduces the space used from O(total) to O(max(nums)).
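The space saving described above can be sketched with a fixed-size window. This is only an illustration, assuming nums is non-empty and contains positive integers; ways_windowed is a hypothetical name, and collections.deque with maxlen is used here as one convenient way to discard old entries.

```python
from collections import deque


def ways_windowed(total, nums):
    width = max(nums)
    # The window holds the last `width` table entries; window[-1] is r[i-1].
    # Entries for negative totals are represented by 0, and r[0] = 1.
    window = deque([0] * (width - 1) + [1], maxlen=width)
    for _ in range(total):
        # window[-n] is r[i-n]; appending drops the oldest entry automatically.
        window.append(sum(window[-n] for n in nums))
    return window[-1]


print(ways_windowed(4, [1, 2, 3]))  # 7
```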

Contains Duplicate II Exceeded time solution

I've written a Python solution for the Contains Duplicate II leetcode problem, but when I test it I get a "time limit exceeded" message. However, I'm confused because I thought my solution is O(n) runtime. Could someone please explain?
def containsNearbyDuplicate(self, nums, k):
    """
    :type nums: List[int]
    :type k: int
    :rtype: bool
    """
    for i in range(len(nums)):
        lookingfor = nums[i]
        rest = nums[i+1: ]
        if lookingfor in rest:
            secondindex = rest.index(lookingfor)+i+1
            if abs(i - secondindex) <= k:
                return True
    return False
In general, using in to search for an element in a list takes linear time.
Applying this to your code, we can observe that the search in rest takes O(len(nums)) time, and you repeat it O(len(nums)) times. That leads to a quadratic runtime, which causes your submission to exceed the time limit.
To get a linear runtime, use a dictionary:
class Solution:
    def containsNearbyDuplicate(self, nums, k):
        seen = {}
        for index, element in enumerate(nums):
            if element in seen and index - seen[element] <= k:
                return True
            seen[element] = index
        return False
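As a possible alternative to the dictionary, a set holding only the previous k values also gives linear time with O(k) extra space. This is a sketch of that sliding-window variant, not part of the answer above; the function name and the variable window are illustrative.

```python
def contains_nearby_duplicate(nums, k):
    window = set()  # values at the (at most) k indices before the current one
    for i, value in enumerate(nums):
        if value in window:
            return True
        window.add(value)
        if len(window) > k:
            # Drop the value that is now more than k positions behind.
            window.remove(nums[i - k])
    return False


print(contains_nearby_duplicate([1, 2, 3, 1], 3))        # True
print(contains_nearby_duplicate([1, 2, 3, 1, 2, 3], 2))  # False
```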

Python Code Optimization Problem (Lintcode Problem 1886 Moving Targets)

I have been working on a problem on https://www.lintcode.com/ and I have run into trouble with one of the questions. The problem requires me to write a function with two parameters: a list of nums and a target num. You have to take all instances of the target in the list and move them to the front of the original list, and the function cannot have a return value. The length of the list is between 1 and 1000000. You also have to do it within a time limit, which is around 400 milliseconds. I can solve the problem, but I can't pass the last test case, where the length of the list is 1000000. Does anyone know how I can make my code faster?
Original Problem Description for anyone who still isn't clear:
Current Code:
def MoveTarget(nums, target):
    if len(set(nums)) == 1:
        return nums
    index = [i for i in range(len(nums)) if nums[i] == target]
    for i in index:
        nums.insert(0, nums.pop(i))
It works if you do:
def MoveTarget(nums, target):
    count = 0
    left, right = len(nums) - 1, len(nums) - 1
    while left >= 0:
        if nums[left] != target:
            nums[right] = nums[left]
            right -= 1
        else:
            count += 1
        left -= 1
    for i in range(count):
        nums[i] = target
but I was wondering if there was another, less complicated way.
Here is a simple and relatively efficient implementation:
def MoveTarget(nums, target):
    n = nums.count(target)
    nums[:] = [target] * n + [e for e in nums if e != target]
It creates a new list with the n target values at the front and appends all the other values that are not target. The input list nums is mutated thanks to the expression nums[:] = ....
The solution runs in linear time, as opposed to the insert-based implementation in the question, which runs in quadratic time. Indeed, insert runs in linear time in CPython.
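The role of nums[:] = ... is easiest to see next to a plain assignment. The two functions below are hypothetical names for this comparison only: the first rebinds the local name and leaves the caller's list untouched, the second replaces the contents of the caller's list.

```python
def move_target_rebind(nums, target):
    # Rebinds the local name only; the caller's list is left untouched.
    nums = [target] * nums.count(target) + [e for e in nums if e != target]


def move_target_in_place(nums, target):
    # Slice assignment replaces the contents of the caller's list object.
    nums[:] = [target] * nums.count(target) + [e for e in nums if e != target]


data = [5, 1, 6, 1]
move_target_rebind(data, 1)
print(data)  # [5, 1, 6, 1]
move_target_in_place(data, 1)
print(data)  # [1, 1, 5, 6]
```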
Your code uses two loops. One in:
index = [i for i in range(len(nums)) if nums[i] == target]
And one in:
for i in index:
    nums.insert(0, nums.pop(i))
Instead, you can find the target and move it to the front of the list in a single loop, which avoids building the separate index list. Note that pop must be given the index of the element, not its value, and that each insert(0, ...) is still a linear-time operation, so the worst case remains quadratic:
def MoveTarget(nums, target):
    if len(set(nums)) == 1:
        return nums
    for i, num in enumerate(nums):
        if num == target:
            nums.insert(0, nums.pop(i))

Why is recursion not the same as dynamic programming in some cases?

I am working on dynamic programming and I came across a problem: find the number of unique paths from point 'a' to point 'b' in an m x n grid. You can only move right or down. My question is why the following recursive approach is much slower than the dynamic one shown after it.
Recursion:
def upaths(m,n):
    count = [0]
    upathsR(m,n, count)
    return count[0]

def upathsR(m,n,c):
    # Increase count when you reach a unique path
    # When you reach 1 there is only one way to get to an end (optimization)
    if m==1 or n==1:
        c[0]+=1
        return
    if m > 1:
        upathsR(m-1,n,c)
    if n > 1:
        upathsR(m,n-1,c)
Dynamic:
def upaths(m, n):
    count = [[1 for x in range(n)] for x in range(m)]
    for i in range(1, m):
        for j in range(1, n):
            count[i][j] = count[i][j-1] + count[i-1][j]
    return count[-1][-1]
Usually recursion has repeated calls that can be memoized, but in this case I only see unique calls. Even so, the recursion is much slower. Can someone explain why?
Following the suggestions from the answers, it now runs faster. And the calls aren't unique.
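That the calls are not unique can be checked by counting how often each (m, n) pair is visited by the plain recursion. The helper below is a sketch written for this purpose; count_calls and walk are made-up names.

```python
from collections import Counter


def count_calls(m, n):
    calls = Counter()

    def walk(m, n):
        calls[(m, n)] += 1  # record every visit to this subproblem
        if m == 1 or n == 1:
            return 1
        return walk(m - 1, n) + walk(m, n - 1)

    return walk(m, n), calls


paths, calls = count_calls(3, 3)
print(paths)          # 6
print(calls[(2, 2)])  # 2, so the same subproblem is solved more than once
```

For larger grids the repeat counts grow quickly, which is the work that the memoized version avoids.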
New Recursive Approach:
def upaths(m,n):
    d = dict()
    return upathsR(m,n,d)

def upathsR(m,n,d):
    if (m,n) in d:
        return d[(m,n)]
    if m==1 or n==1:
        return 1
    d[(m,n)] = upathsR(m-1,n,d)+upathsR(m,n-1,d)
    return d[(m,n)]
