Leetcode question '3Sum' algorithm exceeds time limit, looking for improvement - python

Given an array nums of n integers, are there elements a, b, c in nums such that a + b + c = 0? Find all unique triplets in the array that give a sum of zero.
class Solution:
    def threeSum(self, nums):
        data = []
        i = j = k = 0
        length = len(nums)
        for i in range(length):
            for j in range(length):
                if j == i:
                    continue
                for k in range(length):
                    if k == j or k == i:
                        continue
                    sorted_num = sorted([nums[i],nums[j],nums[k]])
                    if nums[i]+nums[j]+nums[k] == 0 and sorted_num not in data:
                        data.append(sorted_num)
        return data
My solution works, but it appears to be too slow.
Is there a way to improve my code without changing it significantly?

This is an O(n^2) solution with some optimization tricks:
from typing import List
class Solution:
    def findsum(self, lookup: dict, target: int):
        # yield every pair (u, v) of previously seen values with u + v == target
        for u in lookup:
            v = target - u
            # to reduce duplication further, we could enforce v <= u
            try:
                m = lookup[v]
                # when u == v, the value must have been seen at least twice
                if u != v or m > 1:
                    yield u, v
            except KeyError:
                pass
    def threeSum(self, nums: List[int]) -> List[List[int]]:
        lookup = {}  # value -> number of occurrences seen so far
        triplets = set()
        for x in nums:
            for y, z in self.findsum(lookup, -x):
                triplets.add(tuple(sorted([x, y, z])))
            lookup[x] = lookup.get(x, 0) + 1
        return [list(triplet) for triplet in triplets]
First, you need a hash lookup to reduce your O(n^3) algorithm to O(n^2). That is the whole idea; the rest are micro-optimizations:
the lookup table is built while scanning the array, so the whole thing is one pass
the lookup table is keyed on the unique items seen so far, so it handles duplicates efficiently and keeps the iteration count of the second-level loop to a minimum

This is an optimized version that should pass:
from typing import List
class Solution:
    def threeSum(self, nums: List[int]) -> List[List[int]]:
        unique_triplets = []
        nums.sort()
        for i in range(len(nums) - 2):
            # skip duplicate values for the first element
            if i > 0 and nums[i] == nums[i - 1]:
                continue
            lo = i + 1
            hi = len(nums) - 1
            while lo < hi:
                target_sum = nums[i] + nums[lo] + nums[hi]
                if target_sum < 0:
                    lo += 1
                elif target_sum > 0:
                    hi -= 1
                else:
                    # append a list so the result matches List[List[int]]
                    unique_triplets.append([nums[i], nums[lo], nums[hi]])
                    while lo < hi and nums[lo] == nums[lo + 1]:
                        lo += 1
                    while lo < hi and nums[hi] == nums[hi - 1]:
                        hi -= 1
                    lo += 1
                    hi -= 1
        return unique_triplets
The TLE is most likely caused by inputs that fall into these two while loops:
while lo < hi and nums[lo] == nums[lo + 1]:
while lo < hi and nums[hi] == nums[hi - 1]:
References
For additional details, please see the Discussion Board, where you can find plenty of well-explained accepted solutions in a variety of languages, including low-complexity algorithms and asymptotic runtime/memory analysis.

I'd suggest:
for j in range(i+1, length):
This will save you about len(nums)^2/2 steps, and the first if statement becomes redundant.
Also, replace
sorted_num = sorted([nums[i],nums[j],nums[k]])
if nums[i]+nums[j]+nums[k] == 0 and sorted_num not in data:
with
if nums[i]+nums[j]+nums[k] == 0:
    sorted_num = sorted([nums[i],nums[j],nums[k]])
    if sorted_num not in data:
        data.append(sorted_num)
to avoid unneeded calls to sorted in the innermost loop.
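Put together, the suggestions in this answer might look like the sketch below. Starting k at j + 1 as well is my own extension of the same idea, not something the answer states, and the function name three_sum_bruteforce is hypothetical. The result is still O(n^3), so it may well remain too slow for the judge.

```python
def three_sum_bruteforce(nums):
    """Brute force over index triples i < j < k (still O(n^3))."""
    data = []
    length = len(nums)
    for i in range(length):
        for j in range(i + 1, length):
            # assumption: k can start after j for the same reason j starts after i
            for k in range(j + 1, length):
                if nums[i] + nums[j] + nums[k] == 0:
                    # sort only when the sum matches
                    sorted_num = sorted([nums[i], nums[j], nums[k]])
                    if sorted_num not in data:
                        data.append(sorted_num)
    return data


print(three_sum_bruteforce([-1, 0, 1, 2, -1, -4]))
```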

Your solution is the brute-force one, which is also the slowest.
Better solutions:
Fix one element of the array, then use a set to find the other two numbers in the remaining array.
There is a third, even better solution as well. See https://www.gyanblog.com/gyan/coding-interview/leetcode-three-sum/
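A minimal sketch of the set-based idea, under my own assumptions about the details (the answer gives none): the input is sorted first so each triplet comes out in a canonical order, and the function name three_sum_with_set is hypothetical.

```python
def three_sum_with_set(nums):
    """O(n^2) sketch: fix one element, find the other two with a set."""
    nums = sorted(nums)  # sorted copy, so a <= c <= b in every triplet below
    triplets = set()
    for i, a in enumerate(nums):
        if i > 0 and a == nums[i - 1]:
            continue  # this first element was already handled
        seen = set()  # values met so far to the right of position i
        for b in nums[i + 1:]:
            c = -a - b
            if c in seen:
                triplets.add((a, c, b))
            seen.add(b)
    return [list(t) for t in triplets]


print(sorted(three_sum_with_set([-1, 0, 1, 2, -1, -4])))
```

Collecting tuples in a set removes duplicate triplets without the linear "not in data" scan of the original code.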

Related

Binary search when solution is at boundaries?

When working with binary search algorithms, one updates one of the two pointers at each iteration.
However, there are cases, like the LeetCode problem 3Sum Closest, where this would miss the solution.
For example, the following solution of threeSumClosest works:
class Solution:
    def threeSumClosest(self, nums: List[int], target: int) -> int:
        nums.sort()
        distance = float("inf")
        for idx, num in enumerate(nums):
            if num >= target:
                l = 0
                r = idx - 1
            else:
                l = idx + 1
                r = len(nums) - 1
            while l < r:
                res = num + nums[l] + nums[r]
                if abs(target - res) < abs(distance):
                    distance = target - res
                if res < target:
                    l += 1
                else:
                    r -= 1
        return target - distance
However, computing mid and using l = mid + 1 or r = mid - 1 misses the solution. How do you handle these cases?
I was expecting that updating l or r to mid + 1 or mid - 1 would let the algorithm find the right solution.
This question also appears in other forms, like finding the floor/ceiling of a number in a sorted list, or finding the insertion point of a number in a sorted list. In normal binary search if the middle element doesn’t match the predicate we look left or right, but in all of these problems we need to include it.
For example, given the list [1, 2, 4], find the insertion point of 3:
class Solution:
    def searchInsert(self, nums: List[int], target: int) -> int:
        lo = 0
        hi = len(nums) - 1
        while lo < hi:
            mid = lo + (hi - lo) // 2
            if nums[mid] == target:
                return mid
            if nums[mid] < target:
                lo = mid + 1
            else:
                # keep mid in range: it may itself be the insertion point
                hi = mid
        return lo + int(nums[lo] < target)
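For comparison, here is a sketch of the same search over a half-open range [lo, hi), which also handles an empty list and needs no final adjustment, next to the standard library's bisect.bisect_left, which returns the same leftmost insertion point. The function names are hypothetical.

```python
import bisect


def search_insert_half_open(nums, target):
    """Leftmost insertion point, searching the half-open range [lo, hi)."""
    lo, hi = 0, len(nums)
    while lo < hi:
        mid = lo + (hi - lo) // 2
        if nums[mid] < target:
            lo = mid + 1  # mid is too small, so exclude it
        else:
            hi = mid      # mid may be the answer, so keep it in range
    return lo


def search_insert_bisect(nums, target):
    """Same result via the standard library."""
    return bisect.bisect_left(nums, target)


print(search_insert_half_open([1, 2, 4], 3))  # 2
print(search_insert_bisect([1, 2, 4], 3))     # 2
```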

What should I do? I have a recursive function that returns the mth item of a n-bonacci sequence. It has for loop. I am banned from using loops

I have the following recursive function to get the mth term of an n-bonacci sequence, shown in full below. My problem is that for and while loops are completely banned from the code, so I need to get rid of
for i in range(1, n+1):
    temp += n_bonacci(n,m-i)
and convert the code into something that is not a loop but achieves the same effect. Among the things I can use (this is not an exhaustive list) are: (1) built-in functions like sum() and .join(), and (2) list comprehensions.
The full code is as follows:
def n_bonacci(n,m): #n is the number of n initial terms; m is the mth term.
    if (m < n-1):
        return 0
    elif (m == n-1):
        return 1
    else:
        temp = 0
        #[temp += n_bonacci(n,m-i) for i in range(n)] #this is an attempt at using list comprehension
        for i in range(1, n+1):
            temp += n_bonacci(n,m-i)
        return temp
print("n_bonacci:",n_bonacci(2,10))
print("n_bonacci:",n_bonacci(7,20))
Here's a solution that avoids any type of loop, including loops hidden inside comprehensions:
def n_bonacci_loopless(n, m):
    def inner(i, c):
        if i == m:
            return sum(c)
        else:
            return inner(i+1, c[-(n-1):] + [sum(c)])
    if m < n-1:
        return 0
    elif (m == n-1):
        return 1
    else:
        return inner(n, [0] * (n-1) + [1])
The base cases are the same, but for recursive cases it initialises a list of collected results c with n-1 zeroes, followed by a one, the sum of which would be the correct answer for m == n.
For m > n, the inner function is called again as long as i < m, summing the list and appending the result to the end of the last n-1 elements of the list so far.
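To make the evolution of the list c easier to follow, here is a small tracing sketch. It is not part of the answer: n_bonacci_windows is a hypothetical helper that mirrors inner() but returns every list c it receives, assuming m >= n.

```python
def n_bonacci_windows(n, m):
    """Return the successive lists `c` that inner() would receive (m >= n)."""
    def inner(i, c, seen):
        seen = seen + [c]  # record the current window
        if i == m:
            return seen
        # same step as in the answer: keep the last n-1 items, append the sum
        return inner(i + 1, c[-(n - 1):] + [sum(c)], seen)
    return inner(n, [0] * (n - 1) + [1], [])


windows = n_bonacci_windows(2, 5)
print(windows)           # [[0, 1], [1, 1], [1, 2], [2, 3]]
print(sum(windows[-1]))  # 5, the value returned for n=2, m=5
```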
If you are allowed to use comprehensions, the answer is trivial:
def n_bonacci(n,m):
    if (m < n-1):
        return 0
    elif (m == n-1):
        return 1
    else:
        return sum(n_bonacci(n, m-i) for i in range(1, n+1))
You can rewrite the code as follows, passing a generator expression to sum():
def n_bonacci(n,m): #n is the number of n initial terms; m is the mth term.
    if (m < n-1):
        return 0
    elif (m == n-1):
        return 1
    else:
        return sum(n_bonacci(n, m-i) for i in range(1, n + 1))
print("n_bonacci:",n_bonacci(2,10))
print("n_bonacci:",n_bonacci(7,20))
To go beyond @Grismar's answer, you can write your own version of sum that doesn't use loops.
def n_bonacci_loopless(n, m):
    def recsum(l, s=0):
        # base case: nothing left to add
        if not l:
            return s
        return recsum(l[1:], s + l[0])
    def inner(i, c):
        if i == m:
            return recsum(c)
        else:
            return inner(i+1, c[-(n-1):] + [recsum(c)])
    if m < n-1:
        return 0
    elif (m == n-1):
        return 1
    else:
        return inner(n, [0] * (n-1) + [1])

LeetCode 5: Longest Palindromic Substring

I have been working on the LeetCode problem 5. Longest Palindromic Substring:
Given a string s, return the longest palindromic substring in s.
But I kept getting time limit exceeded on large test cases.
I used dynamic programming as follows:
dp[(i, j)] = True implies that s[i] to s[j] is a palindrome. So if s[i] == s[j] and dp[(i+1, j-1)] is set to True, then s[i] to s[j] is also a palindrome.
How can I improve the performance of this implementation?
class Solution:
    def longestPalindrome(self, s: str) -> str:
        dp = {}
        res = ""
        for i in range(len(s)):
            # single character is always a palindrome
            dp[(i, i)] = True
            res = s[i]
        #fill in the table diagonally
        for x in range(len(s) - 1):
            i = 0
            j = x + 1
            while j <= len(s)-1:
                if s[i] == s[j] and (j - i == 1 or dp[(i+1, j-1)] == True):
                    dp[(i, j)] = True
                    if(j-i+1) > len(res):
                        res = s[i:j+1]
                else:
                    dp[(i, j)] = False
                i += 1
                j += 1
        return res
I think the judge's time limit for this problem is rather tight; it took some time to get a pass. Here is an improved version:
class Solution:
    def longestPalindrome(self, s: str) -> str:
        dp = {}
        res = ""
        for i in range(len(s)):
            dp[(i, i)] = True
            res = s[i]
        for x in range(len(s)):  # iterate till the end of the string
            for i in range(x):  # iterate up to the current state (less work) and for loop looks better here
                if s[i] == s[x] and (dp.get((i + 1, x - 1), False) or x - i == 1):
                    dp[(i, x)] = True
                    if x - i + 1 > len(res):
                        res = s[i:x + 1]
        return res
Here is another idea to improve the performance:
The nested loop will check over many cases where the DP value is already False for smaller ranges. We can avoid looking at large spans, by looking for palindromes from inside-out and stop extending the span as soon as it no longer is a palindrome. This process should be repeated at every offset in the source string, but this could still save some processing.
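The plain inside-out idea from the paragraph above, without any further refinement, might be sketched as follows. The function name longest_palindrome_expand is hypothetical and this is only one possible reading of the description.

```python
def longest_palindrome_expand(s):
    """Grow a palindrome outwards from every possible center."""
    best_start, best_len = 0, 0
    for center in range(len(s)):
        # odd-length center (one letter) and even-length center (two letters)
        for lo, hi in ((center, center), (center, center + 1)):
            # stop extending as soon as the span is no longer a palindrome
            while lo >= 0 and hi < len(s) and s[lo] == s[hi]:
                lo -= 1
                hi += 1
            length = hi - lo - 1  # the palindrome found is s[lo+1:hi]
            if length > best_len:
                best_start, best_len = lo + 1, length
    return s[best_start:best_start + best_len]


print(longest_palindrome_expand("babad"))  # bab
print(longest_palindrome_expand("cbbd"))   # bb
```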
The inputs on which the most time is wasted are those with long runs of the same letter, like "aaaaaaabcaaaaaaa". These lead to many iterations: each "a" or "aa" could be the center of a palindrome, but "growing" each of them is a waste of time. We should just consider all consecutive "a" together from the start and expand from there onwards.
You can specifically deal with these cases by first grouping consecutive letters which are the same. So the above example would be turned into 4 groups: a(7)b(1)c(1)a(7)
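The grouping step on its own can be sketched with itertools.groupby. The helper name run_lengths is hypothetical, and the example string is built from its parts so the run lengths are unambiguous.

```python
from itertools import groupby


def run_lengths(s):
    """Collapse runs of equal letters into (letter, run length) pairs."""
    return [(letter, len(list(run))) for letter, run in groupby(s)]


example = "a" * 7 + "bc" + "a" * 7
print(run_lengths(example))  # [('a', 7), ('b', 1), ('c', 1), ('a', 7)]
```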
Then let each group in turn be taken as the center of a palindrome. For each group, "fan out" to potentially include one or more neighboring groups at both sides in "tandem". Continue fanning out until either the outside groups are not about the same letter, or they have a different group size. From that result you can derive what the largest palindrome is around that center. In particular, when the case is that the letters of the outer groups are the same, but not their sizes, you still include that letter at the outside of the palindrome, but with a repetition that corresponds to the least of these two mismatching group sizes.
Here is an implementation. I used named tuples to make it more readable:
from itertools import groupby
from collections import namedtuple
Group = namedtuple("Group", "letter,size,end")
class Solution:
    def longestPalindrome(self, s: str) -> str:
        longest = ""
        x = 0
        groups = [Group(group[0], len(group), x := x + len(group)) for group in
                  ("".join(group[1]) for group in groupby(s))]
        for i in range(len(groups)):
            for j in range(0, min(i+1, len(groups) - i)):
                if groups[i - j].letter != groups[i + j].letter:
                    break
                left = groups[i - j]
                right = groups[i + j]
                if left.size != right.size:
                    break
            size = right.end - (left.end - left.size) - abs(left.size - right.size)
            if size > len(longest):
                x = left.end - left.size + max(0, left.size - right.size)
                longest = s[x:x+size]
        return longest
Alternatively, you can try this approach; it seems to be faster than 96% of Python submissions.
def longestPalindrome(self, s: str) -> str:
    N = len(s)
    if N == 0:
        return ""
    max_len, start = 1, 0
    for i in range(N):
        df = i - max_len
        if df >= 1 and s[df-1: i+1] == s[df-1: i+1][::-1]:
            start = df - 1
            max_len += 2
            continue
        if df >= 0 and s[df: i+1] == s[df: i+1][::-1]:
            start = df
            max_len += 1
    return s[start: start + max_len]
If you want to improve the performance, you should create a variable for len(s) at the beginning of the function and use it. That way instead of calling len(s) 3 times, you would do it just once.
Also, I see no reason to create a class for this function. A simple function will outrun a class method, albeit very slightly.

Optimizing solution to Three Sum

I am trying to solve the 3 Sum problem stated as:
Given an array S of n integers, are there elements a, b, c in S such that a + b + c = 0? Find all unique triplets in the array that give a sum of zero.
Note: The solution set must not contain duplicate triplets.
Here is my solution to this problem:
def threeSum(nums):
    """
    :type nums: List[int]
    :rtype: List[List[int]]
    """
    nums.sort()
    n = len(nums)
    solutions = []
    for i, num in enumerate(nums):
        if i > n - 3:
            break
        left, right = i+1, n-1
        while left < right:
            s = num + nums[left] + nums[right] # check if current sum is 0
            if s == 0:
                new_solution = [num, nums[left], nums[right]]
                # add to the solution set only if this triplet is unique
                if new_solution not in solutions:
                    solutions.append(new_solution)
                right -= 1
                left += 1
            elif s > 0:
                right -= 1
            else:
                left += 1
    return solutions
This solution works fine with a time complexity of O(n**2 + k) and space complexity of O(k) where n is the size of the input array and k is the number of solutions.
While running this code on LeetCode, I am getting TimeOut error for arrays of large size. I would like to know how can I further optimize my code to pass the judge.
P.S: I have read the discussion in this related question. This did not help me resolve the issue.
A couple of improvements you can make to your algorithm:
1) Use a set instead of a list for your solutions. A set ensures that you don't have any duplicates, so you don't have to do an if new_solution not in solutions: check.
2) Add an edge-case check for an all-zero list. It adds little overhead but saves a HUGE amount of time for some cases.
3) Change enumerate to a second while loop. It is a little faster. Weirdly enough, I am getting better performance in the test with a while loop than with n_max = n - 2; for i in range(0, n_max):, even though, going by this question and answer, xrange or range should be faster.
NOTE: If I run the test 5 times I won't get the same time for any of them. All my test are +-100 ms. So take some of the small optimizations with a grain of salt. They might NOT really be faster for all python programs. They might only be faster for the exact hardware/software config the tests are running on.
ALSO: If you remove all the comments from the code it is a LOT faster HAHAHAH like 300ms faster. Just a funny side effect of however the tests are being run.
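Given how noisy single runs are, one way to compare variants locally is to repeat the measurement and keep the best time. This is only a sketch using the standard timeit module: best_time is a hypothetical helper, and local timings will not necessarily match what the LeetCode judge reports.

```python
import timeit


def best_time(func, *args, repeat=5, number=10):
    """Best average seconds per call over several repeated measurements."""
    timings = timeit.repeat(lambda: func(*args), repeat=repeat, number=number)
    # the minimum is usually the least disturbed by other system activity
    return min(timings) / number


data = list(range(1000, 0, -1))
print(best_time(sorted, data))
```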
I have put in the O() notation into all of the parts of your code that take a lot of time.
def threeSum(nums):
    """
    :type nums: List[int]
    :rtype: List[List[int]]
    """
    # timsort: O(nlogn)
    nums.sort()
    # Stored val: Really fast
    n = len(nums)
    # Memory alloc: Fast
    solutions = []
    # O(n) for enumerate
    for i, num in enumerate(nums):
        if i > n - 3:
            break
        left, right = i+1, n-1
        # O(1/2k) where k is n-i? Not 100% sure about this one
        while left < right:
            s = num + nums[left] + nums[right] # check if current sum is 0
            if s == 0:
                new_solution = [num, nums[left], nums[right]]
                # add to the solution set only if this triplet is unique
                # O(n) for not in
                if new_solution not in solutions:
                    solutions.append(new_solution)
                right -= 1
                left += 1
            elif s > 0:
                right -= 1
            else:
                left += 1
    return solutions
Here is some code that won't time out and is fast(ish). It also hints at a way to make the algorithm WAY faster (Use sets more ;) )
class Solution(object):
    def threeSum(self, nums):
        """
        :type nums: List[int]
        :rtype: List[List[int]]
        """
        # timsort: O(nlogn)
        nums.sort()
        # Stored val: Really fast
        n = len(nums)
        # Hash table
        solutions = set()
        # O(n): hash tables are really fast :)
        unique_set = set(nums)
        # covers a lot of edge cases with 2 memory lookups and 1 hash so it's worth the time
        if len(unique_set) == 1 and 0 in unique_set and len(nums) > 2:
            return [[0, 0, 0]]
        # O(n) but a little faster than enumerate.
        i = 0
        while i < n - 2:
            num = nums[i]
            left = i + 1
            right = n - 1
            # O(1/2k) where k is n-i? Not 100% sure about this one
            while left < right:
                # I think its worth the memory alloc for the vars to not have to hit the list index twice. Not sure
                # how much faster it really is. Might save two lookups per cycle.
                left_num = nums[left]
                right_num = nums[right]
                s = num + left_num + right_num # check if current sum is 0
                if s == 0:
                    # add to the solution set only if this triplet is unique
                    # Hash lookup
                    solutions.add(tuple([right_num, num, left_num]))
                    right -= 1
                    left += 1
                elif s > 0:
                    right -= 1
                else:
                    left += 1
            i += 1
        return list(solutions)
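I benchmarked the faster code provided by PeterH, but I found a faster solution, and the code is simpler too.
class Solution(object):
    def threeSum(self, nums):
        res = []
        nums.sort()
        length = len(nums)
        for i in range(length-2):  # the last two positions cannot start a triplet
            if nums[i]>0: break  # sorted input: all remaining numbers are positive
            if i>0 and nums[i]==nums[i-1]: continue  # skip duplicate first elements
            l, r = i+1, length-1  # two pointers over the rest of the array
            while l<r:
                total = nums[i]+nums[l]+nums[r]
                if total<0:  # sum too small: move the left pointer right
                    l+=1
                elif total>0:  # sum too large: move the right pointer left
                    r-=1
                else:  # found a triplet
                    res.append([nums[i], nums[l], nums[r]])
                    while l<r and nums[l]==nums[l+1]:  # skip duplicates on the left
                        l+=1
                    while l<r and nums[r]==nums[r-1]:  # skip duplicates on the right
                        r-=1
                    l+=1
                    r-=1
        return res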
https://leetcode.com/problems/3sum/discuss/232712/Best-Python-Solution-(Explained)

how to apply certain list index to sort in python

I have been asked to sort a k-messed array.
I have the code below. I have to reduce the complexity from O(n log n) to O(n log k).
arr = [3,2,1,4,5,6,8,10,9]
k = 2
def sortKmessedarr(arr, k):
    i = 1
    j = 0
    n = len(arr)
    while i < n:
        if arr[i] > arr[i-1]:
            pass
        else:
            arr[i-1:i+k].sort() # How to sort elements between two specific indexes
        i += 1
sortKmessedarr(arr, k)
print(arr)
I think this approach would bring the complexity down to O(n log k).
But how do I apply sort() between two indexes?
I have also tried another approach like below:
arr = [3,2,1,4,5,6,8,10,9]
k = 2
def sortKmessedarr(arr, k):
    def merge(arr):
        arr.sort()
        print(arr)
    i = 1
    j = 0
    n = len(arr)
    while i < n:
        if arr[i] > arr[i-1]:
            pass
        else:
            merge(arr[i-1:i+k])#.sort()
        i += 1
sortKmessedarr(arr, k)
print(arr)
But still no luck
You can use sorted with slice assignment to get the intended effect syntactically, but I am unsure of the impact on performance (memory or speed):
arr[i-1:i+k] = sorted(arr[i-1:i+k])
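As for the O(n log k) goal itself, a common technique (swapped in here, not taken from this answer) is a min-heap holding k + 1 elements. The sketch below assumes that "k-messed" means every element sits at most k positions away from its sorted position; the function name sort_k_messed is hypothetical, and it returns a new list rather than sorting in place.

```python
import heapq


def sort_k_messed(arr, k):
    """Sort a list whose elements are at most k places from their final spot."""
    heap = arr[:k + 1]      # the overall minimum must be among these
    heapq.heapify(heap)
    out = []
    for x in arr[k + 1:]:
        # push the next element, pop the current minimum: O(log k) per step
        out.append(heapq.heappushpop(heap, x))
    while heap:
        out.append(heapq.heappop(heap))
    return out


print(sort_k_messed([3, 2, 1, 4, 5, 6, 8, 10, 9], 2))
# [1, 2, 3, 4, 5, 6, 8, 9, 10]
```

Each of the n elements passes through a heap of size at most k + 1, which is where the O(n log k) bound comes from.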
