Longest non-decreasing subsequence with minimal sum - python

I am trying to find the algorithm named in the title and am stuck. Basically, I adapted the code given in zzz's answer here, which is the Longest Increasing Subsequence (LIS) algorithm, to get the Longest Non-decreasing Subsequence (LNDS). What I aim to find is an LNDS that has minimal sum (MSLNDS), and I don't know whether that is what I have. As far as I can tell, the original LIS algorithm as presented on Wikipedia does locate a minimal-sum LIS. The docstring of its code says that the LIS algorithm guarantees that if multiple increasing subsequences exist, the one that ends with the smallest value is preferred, and if multiple occurrences of that value can end the sequence, then the earliest occurrence is preferred. I don't know what "earliest occurrence" means here, but I would love not to have to generate all LNDS to find my MSLNDS. It seems to me that the clever transformation given by templatetypedef may be used to show that a unique MSLIS transforms to an MSLNDS, but I don't have the proof. So,
a) Will the LIS algorithm as given on Wikipedia always output a minimal-sum LIS?
b) If the LIS algorithm is adapted this way, will the LNDS algorithm retain this property?
def NDS(X):
    n = len(X)
    X = [0.0] + X
    M = [None]*(n+1)
    P = [None]*(n+1)
    L = 0
    for i in range(1,n+1):
        #########################################
        # for LIS algorithm, this line would be
        # if L == 0 or X[M[1]] >= X[i]:
        #########################################
        if L == 0 or X[M[1]] > X[i]:
            j = 0
        else:
            lo = 1
            hi = L+1
            while lo < hi - 1:
                mid = (lo + hi)//2
                #########################################
                # for LIS algorithm, this line would be
                # if X[M[mid]] < X[i]:
                #########################################
                if X[M[mid]] <= X[i]:
                    lo = mid
                else:
                    hi = mid
            j = lo
        P[i] = M[j]
        if j == L or X[i] < X[M[j+1]]:
            M[j+1] = i
            L = max(L,j+1)
    output = []
    pos = M[L]
    while L > 0:
        output.append(X[pos])
        pos = P[pos]
        L -= 1
    output.reverse()
    return output
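One way to probe questions (a) and (b) empirically, short of a proof, is to compare the output of NDS against a slower reference on many inputs. The sketch below is not the Wikipedia algorithm: it is a plain O(n^2) dynamic program that keeps, for every end position, the greatest length and, among those, the smallest sum. The name min_sum_lnds is made up for illustration.

```python
def min_sum_lnds(xs):
    """Reference O(n^2) DP: a longest non-decreasing subsequence with the smallest sum."""
    n = len(xs)
    if n == 0:
        return []
    length = [1] * n      # best length of a subsequence ending at i
    total = list(xs)      # smallest sum among subsequences of that length ending at i
    prev = [None] * n     # predecessor index used for reconstruction
    for i in range(n):
        for j in range(i):
            if xs[j] <= xs[i]:  # use < here for the strictly increasing variant
                cand_len = length[j] + 1
                cand_sum = total[j] + xs[i]
                if cand_len > length[i] or (cand_len == length[i] and cand_sum < total[i]):
                    length[i], total[i], prev[i] = cand_len, cand_sum, j
    # pick the end position with the greatest length, then the smallest sum
    end = min(range(n), key=lambda i: (-length[i], total[i]))
    out = []
    while end is not None:
        out.append(xs[end])
        end = prev[end]
    return out[::-1]

print(min_sum_lnds([3, 4, 1, 2, 5]))  # [1, 2, 5]
```

Comparing the length and sum of this result with those of NDS on random lists would show quickly whether a counterexample exists, although it cannot prove that none does.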

Related

Leetcode question '3Sum' algorithm exceeds time limit, looking for improvement

Given an array nums of n integers, are there elements a, b, c in nums such that a + b + c = 0? Find all unique triplets in the array which gives the sum of zero.
class Solution:
    def threeSum(self, nums):
        data = []
        i = j = k =0
        length = len(nums)
        for i in range(length):
            for j in range(length):
                if j == i:
                    continue
                for k in range(length):
                    if k == j or k == i:
                        continue
                    sorted_num = sorted([nums[i],nums[j],nums[k]])
                    if nums[i]+nums[j]+nums[k] == 0 and sorted_num not in data:
                        data.append(sorted_num)
        return data
My solution works, but it appears to be too slow.
Is there a way to improve my code without changing it significantly?
This is an O(n^2) solution with some optimization tricks:
from typing import List
class Solution:
    def findsum(self, lookup: dict, target: int):
        # yield every pair (u, v) of previously seen values with u + v == target
        for u in lookup:
            v = target - u
            # reduce duplication, we may enforce v <= u
            try:
                m = lookup[v]
                # u == v is only valid if that value was seen at least twice
                if u != v or m > 1:
                    yield u, v
            except KeyError:
                pass
    def threeSum(self, nums: List[int]) -> List[List[int]]:
        lookup = {}
        triplets = set()
        for x in nums:
            for y, z in self.findsum(lookup, -x):
                triplets.add(tuple(sorted([x, y, z])))
            # count x only after it has been matched against earlier values
            lookup[x] = lookup.get(x, 0) + 1
        return [list(triplet) for triplet in triplets]
First, you need a hash lookup to reduce your O(n^3) algorithm to O(n^2). This is the whole idea, and the rest are micro-optimizations:
The lookup table is built along with the scan of the array, so it is one-pass.
The lookup table is indexed on the unique items seen so far, so it handles duplicates efficiently, and it keeps the iteration count of the second-level loop to a minimum.
This is an optimized version that should pass:
from typing import List
class Solution:
    def threeSum(self, nums: List[int]) -> List[List[int]]:
        unique_triplets = []
        nums.sort()
        for i in range(len(nums) - 2):
            if i > 0 and nums[i] == nums[i - 1]:
                continue
            lo = i + 1
            hi = len(nums) - 1
            while lo < hi:
                target_sum = nums[i] + nums[lo] + nums[hi]
                if target_sum < 0:
                    lo += 1
                if target_sum > 0:
                    hi -= 1
                if target_sum == 0:
                    unique_triplets.append((nums[i], nums[lo], nums[hi]))
                    while lo < hi and nums[lo] == nums[lo + 1]:
                        lo += 1
                    while lo < hi and nums[hi] == nums[hi - 1]:
                        hi -= 1
                    lo += 1
                    hi -= 1
        return unique_triplets
The TLE is most likely caused by inputs that spend a long time in these two while loops:
while lo < hi and nums[lo] == nums[lo + 1]:
while lo < hi and nums[hi] == nums[hi - 1]:
References
For additional details, please see the Discussion Board, where you can find plenty of well-explained accepted solutions in a variety of languages, including low-complexity algorithms and asymptotic runtime/memory analysis.
I'd suggest:
for j in range(i+1, length):
This will save you roughly len(nums)^2/2 iterations of the second loop, and the first if statement becomes redundant.
Also, instead of
sorted_num = sorted([nums[i],nums[j],nums[k]])
if nums[i]+nums[j]+nums[k] == 0 and sorted_num not in data:
check the sum first and sort only when it is zero:
if nums[i]+nums[j]+nums[k] == 0:
    sorted_num = sorted([nums[i],nums[j],nums[k]])
    if sorted_num not in data:
        data.append(sorted_num)
This avoids unneeded calls to sorted in the innermost loop.
Your solution is the brute-force one, and the slowest one.
Better solutions are possible:
Fix one element of the array, then use a Set to find the other two numbers in the remaining part of the array.
There is a third, better solution as well. See https://www.gyanblog.com/gyan/coding-interview/leetcode-three-sum/
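The Set idea above can be sketched as follows. This is only one possible reading of that suggestion, and the function name three_sum_with_set is made up for illustration. For each fixed first element, a set remembers the values already passed in the rest of the array, so the third number is found with one membership test.

```python
def three_sum_with_set(nums):
    """O(n^2) sketch: fix one element, find the other two with a set."""
    nums = sorted(nums)              # sorting keeps each triplet in ascending order
    found = set()
    for i, a in enumerate(nums):
        if i > 0 and a == nums[i - 1]:
            continue                 # same first element was already handled
        seen = set()
        for b in nums[i + 1:]:
            c = -a - b               # value that would complete the triplet
            if c in seen:            # c occurred earlier in the rest, so a <= c <= b
                found.add((a, c, b))
            seen.add(b)
    return [list(t) for t in sorted(found)]

print(three_sum_with_set([-1, 0, 1, 2, -1, -4]))  # [[-1, -1, 2], [-1, 0, 1]]
```

Storing triplets as tuples in a set replaces the linear "not in data" check from the question with a constant-time lookup.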

How to count the number of unique numbers in sorted array using Binary Search?

I am trying to count the number of unique numbers in a sorted array using binary search. I need to get the edge of the change from one number to the next to count. I was thinking of doing this without using recursion. Is there an iterative approach?
def unique(x):
    start = 0
    end = len(x)-1
    count =0
    # This is the current number we are looking for
    item = x[start]
    while start <= end:
        middle = (start + end)//2
        if item == x[middle]:
            start = middle+1
        elif item < x[middle]:
            end = middle -1
        # when item is greater, change to next number
        count+=1
    # if the number
    return count
unique([1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,2,5,5,5,5,5,5,5,5,5,5])
Thank you.
Edit: Even if the runtime benefit over O(n) is negligible, what is my binary search missing? It's confusing when I am not looking for an actual item. How can I fix this?
Working code exploiting binary search (returns 3 for the given example).
As discussed in the comments, the complexity is about O(k*log(n)), where k is the number of unique items, so this approach works well when k is small compared with n, and may become worse than a linear scan when k ~ n.
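def countuniquebs(A):
    n = len(A)
    if n == 0:
        return 0
    t = A[0]
    l = 1
    count = 1  # A[0] starts the first run of equal values
    while l < n:
        # binary search in [l, n) for the first index holding a value > t;
        # r = n means "no larger value left"
        r = n
        while l < r:
            m = (r + l) // 2
            if A[m] > t:
                r = m
            else:
                l = m + 1
        if l < n:
            # a new value starts at index l
            count += 1
            t = A[l]
    return count
print(countuniquebs([1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,2,5,5,5,5,5,5,5,5,5,5]))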
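The same repeated binary search can also be sketched with the standard library's bisect module, which may be easier to get right than hand-written bounds. The function name count_unique_bisect is made up for illustration. bisect_right(a, x, lo) returns the index just past the last occurrence of x, searching from lo onward, which is exactly the edge where one value changes to the next.

```python
from bisect import bisect_right

def count_unique_bisect(a):
    """Count distinct values in a sorted list by jumping from run to run."""
    count = 0
    i = 0
    while i < len(a):
        count += 1
        # jump past every copy of a[i]; the search starts at i, not at 0
        i = bisect_right(a, a[i], i)
    return count

print(count_unique_bisect([1] * 10 + [2] * 11 + [5] * 10))  # 3
```

Each jump costs O(log(n)), so the total is again about O(k*log(n)) for k distinct values.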
I wouldn't quite call it "using a binary search", but this binary divide-and-conquer algorithm works in O(k*log(n)/log(k)) time, which is better than a repeated binary search, and never worse than a linear scan:
def countUniques(A, start, end):
    length = end-start
    if length < 1:
        return 0
    if A[start] == A[end-1]:
        return 1
    if length < 3:
        return 2
    mid = start + length//2
    return countUniques(A, start, mid+1) + countUniques(A, mid, end) - 1
A = [1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,2,3,4,5,5,5,5,5,5,5,5,5,5]
print(countUniques(A,0,len(A)))

Longest Arithmetic Progression

Given a list of numbers arr (not sorted), find the Longest Arithmetic Progression in it.
Constraints: arr is an array of integers with
1 ≤ arr.size() ≤ 10^3 and
-10^9 ≤ arr[i] ≤ 10^9.
Examples:
arr = [7,6,1,9,7,9,5,6,1,1,4,0] -------------- output = [7,6,5,4]
arr = [4,4,6,7,8,13,45,67] -------------- output = [4,6,8]
from itertools import combinations
def arithmeticProgression2(a):
    n=len(a)
    diff = ((y-x, x) for x, y in combinations(a, 2))
    dic=[]
    for d, n in diff:
        k = []
        seq=a
        while n in seq:
            k.append(n)
            i=seq.index(n)
            seq=seq[i+1:]
            n += d
        dic.append(k)
    maxx=max([len(k) for k in dic])
    for x in dic:
        if len(x)==maxx:
            return x
When arr.size() is large, my code runs for more than 4000 ms.
Example:
arr = [randint(-10**9,10**9) for i in range(10**3)]
runtime > 4000ms
How can I reduce the complexity of the above solution?
One of the things that makes the code slow is that you build each series from scratch for every pair, which is not necessary:
You don't actually need to build k each time. If you just keep the step, the length and the start (or end) value of a progression, you know enough. Only build the progression explicitly for the final result.
By doing this for each pair, you also create series whose start point is in fact in the middle of a longer series (having the same step), so you partly do double work, and work that is not useful, since the progression that starts earlier will evidently be longer than the one currently analysed.
This makes your code run in O(n³) time instead of the possible O(n²).
The following seems to return the result much faster in O(n²), using dynamic programming:
def longestprogression(data):
    if len(data) < 3:
        return data
    maxlen = 0 # length of longest progression so far
    endvalue = None # last value of longest progression
    beststep = None # step of longest progression
    # progressions ending in index i, keyed by their step size,
    # with the progression length as value
    dp = [{} for _ in range(len(data))]
    # iterate all possible ending pairs of progressions
    for j in range(1, len(data)):
        for i in range(j):
            step = data[j] - data[i]
            if step in dp[i]:
                curlen = dp[i][step] + 1
            else:
                curlen = 2
            dp[j][step] = curlen
            if curlen > maxlen:
                maxlen = curlen
                endvalue = data[j]
                beststep = step
    # rebuild the longest progression from the values we maintained
    # (computed term by term, so a step of 0 also works)
    return [endvalue - k * beststep for k in range(maxlen - 1, -1, -1)]

Longest Increasing Cyclic Subsequence

I have code for finding the longest increasing subsequence, but I'd like to extend it to allow wrap-arounds. For example, for the sequence (4,5,6,1,2,3) the longest increasing cyclic subsequence is (1,2,3,4,5,6), since once we reach 3 we can go back to the beginning of the sequence (we can only do this once). Is anyone able to help me?
Here is the code:
def longest_increasing_subsequence(X):
    N = len(X)
    P = [0] * N
    M = [0] * (N+1)
    L = 0
    for i in range(N):
        lo = 1
        hi = L
        while lo <= hi:
            mid = (lo+hi)//2
            if (X[M[mid]] < X[i]):
                lo = mid+1
            else:
                hi = mid-1
        newL = lo
        P[i] = M[newL-1]
        M[newL] = i
        if (newL > L):
            L = newL
    S = []
    k = M[L]
    for i in range(L-1, -1, -1):
        S.append(X[k])
        k = P[k]
    return len(S[::-1])
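Concatenate the sequence to the end of itself, then run your algorithm on it. Note that a subsequence of the doubled sequence can span more than one full rotation (for (1,3,2) the doubled sequence contains 1,2,3), so this gives an upper bound unless each candidate is restricted to a window of len(X) consecutive positions.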
Just check the returned value from the function on each shift:
max_increasing=longest_increasing_subsequence(X)
for i in range(len(X)-1):
    X.append(X.pop(0)) #shift X by 1 (X must be a list; append returns None, so do not reassign X)
    current=longest_increasing_subsequence(X)
    if current>max_increasing:
        max_increasing=current
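The check-every-shift idea can be sketched in a self-contained form as below. This is an illustration under assumptions, not the asker's function: lis_length is a bisect-based stand-in for longest_increasing_subsequence, and both function names are made up. Each rotation is built by slicing, so the input is not modified and tuples work too.

```python
from bisect import bisect_left

def lis_length(seq):
    """Length of the longest strictly increasing subsequence, O(n log n)."""
    tails = []  # tails[k] = smallest tail of an increasing subsequence of length k+1
    for x in seq:
        pos = bisect_left(tails, x)
        if pos == len(tails):
            tails.append(x)
        else:
            tails[pos] = x
    return len(tails)

def longest_increasing_cyclic(seq):
    """Best LIS length over all rotations of seq (one wrap-around allowed)."""
    seq = list(seq)
    return max((lis_length(seq[s:] + seq[:s]) for s in range(len(seq))), default=0)

print(longest_increasing_cyclic((4, 5, 6, 1, 2, 3)))  # 6
print(longest_increasing_cyclic((1, 3, 2)))           # 2
```

With n rotations and an O(n log n) search for each, the whole check runs in O(n^2 log n).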

Optimal Search Tree Using Python - Code Analysis

First of all, sorry about the naive question, but I couldn't find help elsewhere.
I'm trying to create an Optimal Search Tree using Dynamic Programming in Python that receives two lists (a set of keys and a set of frequencies) and returns two answers:
1 - The smallest path cost.
2 - The generated tree for that smallest cost.
I basically need to create a tree organized with the most accessed items on top (the most accessed item is the root), and return the smallest path cost for that tree, using the Dynamic Programming solution.
I have implemented the following code in Python:
def optimalSearchTree(keys, freq, n):
    #Create an auxiliary 2D matrix to store results of subproblems
    cost = [[0 for x in xrange(n)] for y in xrange(n)]
    #For a single key, cost is equal to frequency of the key
    #for i in xrange (0,n):
    #    cost[i][i] = freq[i]
    # Now we need to consider chains of length 2, 3, ... .
    # L is chain length.
    for L in xrange (2,n):
        for i in xrange(0,n-L+1):
            j = i+L-1
            cost[i][j] = sys.maxint
            for r in xrange (i,j):
                if (r > i):
                    c = cost[i][r-1] + sum(freq, i, j)
                elif (r < j):
                    c = cost[r+1][j] + sum(freq, i, j)
                elif (c < cost[i][j]):
                    cost[i][j] = c
    return cost[0][n-1]
def sum(freq, i, j):
    s = 0
    k = i
    for k in xrange (k,j):
        s += freq[k]
    return s
keys = [10,12,20]
freq = [34,8,50]
n=sys.getsizeof(keys)/sys.getsizeof(keys[0])
print(optimalSearchTree(keys, freq, n))
I'm trying to output the answer 1. The smallest cost for that tree should be 142 (the value stored on the Matrix Position [0][n-1], according to the Dynamic Programming solution). But unfortunately it's returning 0. I couldn't find any issues in that code. What's going wrong?
You have several very questionable statements in your code, definitely inspired by C/Java programming practices. For instance,
keys = [10,12,20]
freq = [34,8,50]
n=sys.getsizeof(keys)/sys.getsizeof(keys[0])
I think you meant to calculate the number of items in the list. However, n is not 3:
sys.getsizeof(keys)/sys.getsizeof(keys[0])
3.142857142857143
What you need is this:
n = len(keys)
One more find: elif (r < j) is always True, because r is in the range between i (inclusive) and j (exclusive). The elif (c < cost[i][j]) condition is therefore never checked, and the matrix cost is never updated in the loop. In addition, the loop over L stops at n-1, so the full range cost[0][n-1] is never computed at all. That's why you always end up with a 0.
Another suggestion: do not overwrite the built-in function sum(). Your namesake function calculates the sum of all items in a slice of a list:
sum(freq[i:j])
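To make the two suggestions concrete, here is a small sketch. The name range_sum is hypothetical: it stands in for the question's hand-written helper, renamed so that the built-in sum is not shadowed. The check confirms that the built-in applied to a slice gives the same result for every range.

```python
keys = [10, 12, 20]
freq = [34, 8, 50]

def range_sum(freq, i, j):
    """Hypothetical stand-in for the question's helper: adds freq[i] .. freq[j-1]."""
    s = 0
    for k in range(i, j):
        s += freq[k]
    return s

n = len(keys)  # number of items, not a ratio of byte sizes
# The built-in sum over a slice matches the loop for every i <= j.
same = all(range_sum(freq, i, j) == sum(freq[i:j])
           for i in range(n + 1) for j in range(i, n + 1))
print(n, same)  # 3 True
```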
import sys
def optimalSearchTree(keys, freq):
    #Create an auxiliary 2D matrix to store results of subproblems
    n = len(keys)
    cost = [[0 for x in range(n)] for y in range(n)]
    storeRoot = [[0 for i in range(n)] for i in range(n)]
    #For a single key, cost is equal to frequency of the key
    for i in range (0,n):
        cost[i][i] = freq[i]
    # Now we need to consider chains of length 2, 3, ... .
    # L is chain length.
    for L in range (2,n+1):
        for i in range(0,n-L+1):
            j = i + L - 1
            cost[i][j] = sys.maxsize
            for r in range (i,j+1):
                c = (cost[i][r-1] if r > i else 0)
                c += (cost[r+1][j] if r < j else 0)
                c += sum(freq[i:j+1])
                if (c < cost[i][j]):
                    cost[i][j] = c
                    storeRoot[i][j] = r
    return cost[0][n-1], storeRoot
if __name__ == "__main__" :
    keys = [10,12,20]
    freq = [34,8,50]
    print(optimalSearchTree(keys, freq))
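The question also asks for the tree itself. A table of chosen roots is enough to rebuild it recursively. The sketch below is an illustration under assumptions: build_tree is a made-up helper, the tree is represented as nested (key, left, right) tuples, and root[i][j] is assumed to hold the index of the root chosen for keys[i..j], as storeRoot does. Single-key ranges are handled explicitly, because the table is only filled for ranges of length 2 or more. The table literal was worked out by hand for the example frequencies.

```python
def build_tree(keys, root, i, j):
    """Return the subtree for keys[i..j] as (key, left, right), or None if empty."""
    if i > j:
        return None
    # A single key is its own root; the table is only filled for longer ranges.
    r = i if i == j else root[i][j]
    return (keys[r],
            build_tree(keys, root, i, r - 1),
            build_tree(keys, root, r + 1, j))

keys = [10, 12, 20]
# Root table for freq = [34, 8, 50], worked out by hand for this example.
root = [[0, 0, 2],
        [0, 0, 2],
        [0, 0, 0]]
tree = build_tree(keys, root, 0, len(keys) - 1)
print(tree)  # (20, (10, None, (12, None, None)), None)
```

In this tree 20 sits at depth 1, 10 at depth 2 and 12 at depth 3, so the cost is 50*1 + 34*2 + 8*3 = 142, the value the question expects.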
