leetcode two sum problem algorithm efficiency - python

Problem: Given an array of integers nums and an integer target, return indices of the two numbers such that they add up to target.
You may assume that each input would have exactly one solution, and you may not use the same element twice.
You can return the answer in any order.
Input: nums = [2,7,11,15],
target = 9
Output: [0,1]
Explanation: Because nums[0] + nums[1] == 9, we return [0, 1].
Here's my code:
def twoSum(nums, target):
    cnt = 0
    i = 0
    while cnt < len(nums):
        temp = 0
        if i == cnt:
            i += 1
        else:
            temp = nums[cnt] + nums[i]
            if temp == target and i < cnt:
                return [i, cnt]
            i += 1
        if i == len(nums)-1:
            i = 0
            cnt += 1
The code works fine for 55/57 test cases, but it fails on the really big inputs. I don't understand why this is happening, because I have used only one loop, so the time complexity should be O(N), which is efficient enough to run in the given time. Any idea what I am missing? And what can I do to make the algorithm more efficient?

You can make a dictionary mapping each number's complement (target minus the number) to its last position. Then use it to find a value whose complement exists in the list at a greater index (the greater-index check matters when a value is exactly half the target):
nums = [2,7,11,15]
target = 9
pos = {target-n:i for i,n in enumerate(nums)}
sol = next([i,pos[n]] for i,n in enumerate(nums) if i<pos.get(n,i))
print(sol)
[0, 1]
This works in O(n) time and space.
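A quick check of the half-target case mentioned above (my own example, not from the original answer): when two equal values sum to the target, the i < pos.get(n, i) guard still picks two distinct indices:
nums = [3, 2, 4, 3]
target = 6
pos = {target-n: i for i, n in enumerate(nums)}
sol = next([i, pos[n]] for i, n in enumerate(nums) if i < pos.get(n, i))
print(sol)
[0, 3]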

If we're not talking about space complexity:
def search(values, target):
    hashmap = {}
    for i in range(len(values)):
        current = values[i]
        if target - current in hashmap:
            # return the stored index of the complement and the current index
            return hashmap[target - current], i
        hashmap[current] = i
    return None
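For the example input from the question this returns the two indices (a usage check I've added, assuming the index-returning version above):
print(search([2, 7, 11, 15], 9))
(0, 1)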

Your code isn't really O(n), it's actually O(n^2) in disguise.
You go through i O(n) times for each cnt (and then reset i back to 0), and go through cnt O(n) times.
For a more efficient algorithm, sites like this one (https://www.educative.io/edpresso/how-to-implement-the-two-sum-problem-in-python) have it down pretty well.

I am not sure of the time complexity, but I think this solution will be better. p1 and p2 act as two index pointers:
def twoSum(nums, target):
    nums2 = nums[:]
    nums2.sort()
    p1 = 0
    p2 = len(nums2)-1
    while nums2[p1]+nums2[p2] != target:
        if nums2[p1]+nums2[p2] < target:
            p1 += 1
        elif nums2[p1]+nums2[p2] > target:
            p2 -= 1
    return nums.index(nums2[p1]), nums.index(nums2[p2])
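One caveat worth noting (my observation, not part of the original answer): nums.index always returns the first occurrence, so when the two chosen values are equal, for example nums = [3, 3] with target = 6, both lookups return the same index. A small sketch of a fix (the name two_sum_two_pointer is mine) is to start the second search after the first match:
def two_sum_two_pointer(nums, target):
    # Sort a copy and walk two pointers inward (same idea as above).
    nums2 = sorted(nums)
    p1, p2 = 0, len(nums2) - 1
    while nums2[p1] + nums2[p2] != target:
        if nums2[p1] + nums2[p2] < target:
            p1 += 1
        else:
            p2 -= 1
    i = nums.index(nums2[p1])
    if nums2[p1] == nums2[p2]:
        # Equal values: look for the second index after the first one.
        j = nums.index(nums2[p2], i + 1)
    else:
        j = nums.index(nums2[p2])
    return i, j

print(two_sum_two_pointer([3, 3], 6))  # (0, 1)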

Related

Count the number of subarrays such that each element in the subarray appears at least twice

I have come across a problem and can't seem to come up with an efficient solution. The problem is as follows:
Given an array A, count the number of consecutive contiguous subarrays such that each element in the subarray appears at least twice.
Ex, for:
A = [0,0,0]
The answer should be 3 because we have
A[0..1] = [0,0]
A[1..2] = [0,0]
A[0..2] = [0,0,0]
Another example:
A=[1,2,1,2,3]
The answer should be 1 because we have:
A[0..3] = [1,2,1,2]
I can't seem to come up with an efficient solution for this algorithm. I have an algorithm that checks every possible subarray (O(n^2)), but I need something better. This is my naive solution:
def duplicatesOnSegment(arr):
    total = 0
    for i in range(0, len(arr)):
        unique = 0
        test = {}
        for j in range(i, len(arr)):
            k = arr[j]
            if k not in test:
                test[k] = 1
            else:
                test[k] += 1
            if test[k] == 1:
                unique += 1
            elif test[k] == 2:
                unique -= 1
            if unique == 0:
                total += 1
    return total
Your program in the worst case is greater than O(n^2), because you use if k not in test inside a nested loop; in the worst case that lookup is O(n), giving an overall worst case of O(n^3). I have this O(n^2)-worst-case solution, which uses collections.defaultdict as the hash to make it faster.
from collections import defaultdict

def func(A):
    result = 0
    for i in range(0, len(A)):
        counter = 0
        hash = defaultdict(int)
        for j in range(i, len(A)):
            hash[A[j]] += 1
            if hash[A[j]] == 2:
                counter += 1
            if counter != 0 and counter == len(hash):
                result += 1
    return result
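A quick check against the two examples from the question (a usage sketch added here; both values follow from the counting logic above):
print(func([0, 0, 0]))        # 3
print(func([1, 2, 1, 2, 3]))  # 1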
To start with, I would consider the negation of the property of interest:
not(every x in l appears at least twice) = there exists x in l such that no other y in l equals x
I'm not sure yet that this reformulation of your question helps, but my intuition as a mathematician told me to try this way... It still feels like O(n^2), though:
my_list = ['b', 'b', 'a', 'd', 'd', 'c']

def remaining_list(el, list):
    assert el in list
    if el == list[-1]: return []
    else: return list[list.index(el)+1:]

def has_duplicate(el, list):
    assert el in list
    return el in remaining_list(el, list)

list(filter(lambda e: not(has_duplicate(e, my_list)), my_list))

How to count the number of unique numbers in sorted array using Binary Search?

I am trying to count the number of unique numbers in a sorted array using binary search. I need to get the edge of the change from one number to the next to count. I was thinking of doing this without using recursion. Is there an iterative approach?
def unique(x):
    start = 0
    end = len(x)-1
    count = 0
    # This is the current number we are looking for
    item = x[start]
    while start <= end:
        middle = (start + end)//2
        if item == x[middle]:
            start = middle+1
        elif item < x[middle]:
            end = middle - 1
        # when item is greater, change to the next number
        count += 1
        # if the number
    return count

unique([1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,2,5,5,5,5,5,5,5,5,5,5])
Thank you.
Edit: Even if the runtime benefit over O(n) is negligible, what is my binary search missing? It's confusing when I'm not looking for an actual item. How can I fix this?
Working code exploiting binary search (it returns 3 for the given example).
As discussed in the comments, the complexity is about O(k*log(n)) where k is the number of unique items, so this approach works well when k is small compared with n, and can become worse than a linear scan when k ~ n:
def countuniquebs(A):
    n = len(A)
    t = A[0]
    l = 1
    count = 0
    while l < n - 1:
        r = n - 1
        # binary search for the first index whose value is greater than t
        while l < r:
            m = (r + l) // 2
            if A[m] > t:
                r = m
            else:
                l = m + 1
        count += 1
        if l < n:
            t = A[l]
    return count

print(countuniquebs([1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,2,5,5,5,5,5,5,5,5,5,5]))
I wouldn't quite call it "using a binary search", but this binary divide-and-conquer algorithm works in O(k*log(n)/log(k)) time, which is better than a repeated binary search, and never worse than a linear scan:
def countUniques(A, start, end):
    # Counts distinct values in the sorted slice A[start:end].
    length = end - start
    if length < 1:
        return 0
    if A[start] == A[end-1]:
        return 1
    if length < 3:
        return 2
    mid = start + length//2
    # The two halves overlap at index mid, so that value is counted once in each
    # half; subtract 1 to compensate.
    return countUniques(A, start, mid+1) + countUniques(A, mid, end) - 1

A = [1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,2,3,4,5,5,5,5,5,5,5,5,5,5]
print(countUniques(A, 0, len(A)))

Optimizing solution to Three Sum

I am trying to solve the 3 Sum problem stated as:
Given an array S of n integers, are there elements a, b, c in S such that a + b + c = 0? Find all unique triplets in the array which gives the sum of zero.
Note: The solution set must not contain duplicate triplets.
Here is my solution to this problem:
def threeSum(nums):
    """
    :type nums: List[int]
    :rtype: List[List[int]]
    """
    nums.sort()
    n = len(nums)
    solutions = []
    for i, num in enumerate(nums):
        if i > n - 3:
            break
        left, right = i+1, n-1
        while left < right:
            s = num + nums[left] + nums[right]  # check if current sum is 0
            if s == 0:
                new_solution = [num, nums[left], nums[right]]
                # add to the solution set only if this triplet is unique
                if new_solution not in solutions:
                    solutions.append(new_solution)
                right -= 1
                left += 1
            elif s > 0:
                right -= 1
            else:
                left += 1
    return solutions
This solution works fine with a time complexity of O(n**2 + k) and space complexity of O(k) where n is the size of the input array and k is the number of solutions.
While running this code on LeetCode, I am getting a TimeOut error for arrays of large size. I would like to know how I can further optimize my code to pass the judge.
P.S: I have read the discussion in this related question. This did not help me resolve the issue.
A couple of improvements you can make to your algorithm:
1) Use a set instead of a list for your solutions. Using a set ensures that you don't have any duplicates, and you don't have to do the if new_solution not in solutions: check.
2) Add an edge-case check for an all-zero list. Not much overhead, but it saves a HUGE amount of time for some cases.
3) Change enumerate to a second while loop. It is a little faster. Weirdly enough, I am getting better performance in the test with a while loop than with n_max = n - 2; for i in range(0, n_max):. Reading this question and answer, xrange or range should be faster.
NOTE: If I run the test 5 times I won't get the same time for any of them. All my tests are +-100 ms, so take some of the small optimizations with a grain of salt. They might NOT really be faster for all Python programs; they might only be faster for the exact hardware/software config the tests are running on.
ALSO: If you remove all the comments from the code it is a LOT faster, HAHAHAH, like 300 ms faster. Just a funny side effect of however the tests are being run.
I have put O() notation on all of the parts of your code that take a lot of time:
def threeSum(nums):
    """
    :type nums: List[int]
    :rtype: List[List[int]]
    """
    # timsort: O(nlogn)
    nums.sort()
    # Stored val: Really fast
    n = len(nums)
    # Memory alloc: Fast
    solutions = []
    # O(n) for enumerate
    for i, num in enumerate(nums):
        if i > n - 3:
            break
        left, right = i+1, n-1
        # O(1/2k) where k is n-i? Not 100% sure about this one
        while left < right:
            s = num + nums[left] + nums[right]  # check if current sum is 0
            if s == 0:
                new_solution = [num, nums[left], nums[right]]
                # add to the solution set only if this triplet is unique
                # O(n) for not in
                if new_solution not in solutions:
                    solutions.append(new_solution)
                right -= 1
                left += 1
            elif s > 0:
                right -= 1
            else:
                left += 1
    return solutions
Here is some code that won't time out and is fast(ish). It also hints at a way to make the algorithm WAY faster (Use sets more ;) )
class Solution(object):
    def threeSum(self, nums):
        """
        :type nums: List[int]
        :rtype: List[List[int]]
        """
        # timsort: O(nlogn)
        nums.sort()
        # Stored val: Really fast
        n = len(nums)
        # Hash table
        solutions = set()
        # O(n): hash tables are really fast :)
        unique_set = set(nums)
        # covers a lot of edge cases with 2 memory lookups and 1 hash so it's worth the time
        if len(unique_set) == 1 and 0 in unique_set and len(nums) > 2:
            return [[0, 0, 0]]
        # O(n) but a little faster than enumerate.
        i = 0
        while i < n - 2:
            num = nums[i]
            left = i + 1
            right = n - 1
            # O(1/2k) where k is n-i? Not 100% sure about this one
            while left < right:
                # I think its worth the memory alloc for the vars to not have to hit the list index twice.
                # Not sure how much faster it really is. Might save two lookups per cycle.
                left_num = nums[left]
                right_num = nums[right]
                s = num + left_num + right_num  # check if current sum is 0
                if s == 0:
                    # add to the solution set only if this triplet is unique
                    # Hash lookup
                    solutions.add(tuple([right_num, num, left_num]))
                    right -= 1
                    left += 1
                elif s > 0:
                    right -= 1
                else:
                    left += 1
            i += 1
        return list(solutions)
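A quick usage check with the standard LeetCode example (my own call, added for illustration; the triplets come back in arbitrary set order):
print(Solution().threeSum([-1, 0, 1, 2, -1, -4]))
# e.g. [(2, -1, -1), (1, -1, 0)] -- two unique triplets, order may vary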
I benchmarked the faster code provided by PeterH, but I found a faster solution, and the code is simpler too.
class Solution(object):
    def threeSum(self, nums):
        res = []
        nums.sort()
        length = len(nums)
        for i in xrange(length-2):  # [8]
            if nums[i] > 0: break  # [7]
            if i > 0 and nums[i] == nums[i-1]: continue  # [1]
            l, r = i+1, length-1  # [2]
            while l < r:
                total = nums[i]+nums[l]+nums[r]
                if total < 0:  # [3]
                    l += 1
                elif total > 0:  # [4]
                    r -= 1
                else:  # [5]
                    res.append([nums[i], nums[l], nums[r]])
                    while l < r and nums[l] == nums[l+1]:  # [6]
                        l += 1
                    while l < r and nums[r] == nums[r-1]:  # [6]
                        r -= 1
                    l += 1
                    r -= 1
        return res
https://leetcode.com/problems/3sum/discuss/232712/Best-Python-Solution-(Explained)

a function that returns number of sums of a certain number.py

I need to write a function that returns the number of ways of reaching a certain number by adding numbers of a list. For example:
print(p([3,5,8,9,11,12,20], 20))
should return: 5
The code I wrote is:
def pow(lis):
    power = [[]]
    for lst in lis:
        for po in power:
            power = power + [list(po)+[lst]]
    return power
def p(lst, n):
    counter1 = 0
    counter2 = 0
    power_list = pow(lst)
    print(power_list)
    for p in power_list:
        for j in p:
            counter1 += j
        if counter1 == n:
            counter2 += 1
            counter1 == 0
        else:
            counter1 == 0
    return counter2
pow() is a function that returns all of the subsets of the list and p should return the number of ways to reach the number n. I keep getting an output of zero and I don't understand why. I would love to hear your input for this.
Thanks in advance.
There are two typos in your code: counter1 == 0 is a boolean expression; it does not reset anything.
This version should work:
def p(lst, n):
    counter2 = 0
    power_list = pow(lst)
    for p in power_list:
        counter1 = 0  # reset the counter for every new subset
        for j in p:
            counter1 += j
        if counter1 == n:
            counter2 += 1
    return counter2
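With the reset in the right place, the example from the question gives the expected count (a usage check added here; it relies on the pow() helper defined above):
print(p([3, 5, 8, 9, 11, 12, 20], 20))  # 5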
As tobias_k and Faibbus mentioned, you have a typo: counter1 == 0 instead of counter1 = 0, in two places. The counter1 == 0 produces a boolean object of True or False, but since you don't assign the result of that expression the result gets thrown away. It doesn't raise a SyntaxError, since an expression that isn't assigned is legal Python.
As John Coleman and B. M. mention it's not efficient to create the full powerset and then test each subset to see if it has the correct sum. This approach is ok if the input sequence is small, but it's very slow for even moderately sized sequences, and if you actually create a list containing the subsets rather than using a generator and testing the subsets as they're yielded you'll soon run out of RAM.
B. M.'s first solution is quite efficient since it doesn't produce subsets that are larger than the target sum. (I'm not sure what B. M. is doing with that dict-based solution...).
But we can enhance that approach by sorting the list of sums. That way we can break out of the inner for loop as soon as we detect a sum that's too high. True, we need to sort the sums list on each iteration of the outer for loop, but fortunately Python's TimSort is very efficient, and it's optimized to handle sorting a list that contains sorted sub-sequences, so it's ideal for this application.
def subset_sums(seq, goal):
    sums = [0]
    for x in seq:
        subgoal = goal - x
        temp = []
        for y in sums:
            if y > subgoal:
                break
            temp.append(y + x)
        sums.extend(temp)
        sums.sort()
    return sum(1 for y in sums if y == goal)

# test
lst = [3, 5, 8, 9, 11, 12, 20]
total = 20
print(subset_sums(lst, total))

lst = range(1, 41)
total = 70
print(subset_sums(lst, total))
output
5
28188
With lst = range(1, 41) and total = 70, this code is around 3 times faster than the B.M. lists version.
A one-pass solution with one counter, which minimizes additions:
def one_pass_sum(L, target):
    sums = [0]
    cnt = 0
    for x in L:
        for y in sums[:]:
            z = x + y
            if z <= target:
                sums.append(z)
                if z == target: cnt += 1
    return cnt
This way, if n = len(L), you make fewer than 2^n additions, versus n/2 * 2^n when calculating all the sums.
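As a quick check with the example from the question (added here):
print(one_pass_sum([3, 5, 8, 9, 11, 12, 20], 20))  # 5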
EDIT:
A more efficient solution that just counts the ways. The idea is that if there are k ways to make z-x, then there are k more ways to make z once x arrives:
def enhanced_sum_with_lists(L, target):
    cnt = [1] + [0]*target  # 1 way to make 0
    for x in L:
        for z in range(target, x-1, -1):  # [target, ..., x+1, x]
            cnt[z] += cnt[z-x]
    return cnt[target]
But order is important: z must be iterated in descending order here to get the correct counts (thanks to PM 2Ring).
This can be very fast (n*target additions) for big lists.
For example :
>>> enhanced_sum_with_lists(range(1,100),2500)
875274644371694133420180815
is obtained in 61 ms. It will take the age of the universe to compute it by the first method.
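To see why the descending iteration matters (a small illustration I'm adding, not from the original answer): iterating z upward lets the same x be reused within one pass, so it counts combinations with repetition instead of subsets. For example:
def ascending_variant(L, target):
    # same recurrence, but z iterated upward: x can be reused within one pass
    cnt = [1] + [0]*target
    for x in L:
        for z in range(x, target+1):
            cnt[z] += cnt[z-x]
    return cnt[target]

print(ascending_variant([3], 6))           # 1: counts 3+3, reusing the single 3
print(enhanced_sum_with_lists([3], 6))     # 0: each element used at most once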
from itertools import chain, combinations

def powerset_generator(i):
    for subset in chain.from_iterable(combinations(i, r) for r in range(len(i)+1)):
        yield set(subset)

def count_sum(s, cnt):
    return sum(1 for i in powerset_generator(s) if sum(k for k in i) == cnt)

print(count_sum(set([3,5,8,9,11,12,20]), 20))

Subset sum algorithm a little faster than 2^(n/2) in worst time?

After analyzing the fastest subset sum algorithm which runs in 2^(n/2) time, I noticed a slight optimization that can be done. I'm not sure if it really counts as an optimization and if it does, I'm wondering if it can be improved by recursion.
Basically from the original algorithm: http://en.wikipedia.org/wiki/Subset_sum_problem (see part with title Exponential time algorithm)
it takes the list and splits it into two
then it generates the sorted power sets of both in 2^(n/2) time
then it does a linear search in both lists to see if one value from each list sums to x, using a clever trick
In my version with the optimization
it takes the list and removes the last element (call it last)
then it splits the list in two
then it generates the sorted power sets of both in 2^((n-1)/2) time
then it does a linear search in both lists to see if one value from each list sums to x or x-last (at the same time, with the same running time) using the same clever trick
If it finds either, then I will know it worked. I tried using Python's time functions to test with lists of size 22, and my version comes out about twice as fast.
After running the below code, it shows
0.050999879837 <- the original algorithm
0.0250000953674 <- my algorithm
My logic for the recursion part is: well, if it works for a size-n list in 2^((n-1)/2) time, can we not repeat this again and again?
Does any of this make sense, or am I totally wrong?
Thanks
I created this python code:
from math import log, ceil, floor
import helper  # my own code
from random import randint, uniform
import time

# gets a list of unique random floats
# s = how many random numbers
# l = smallest float can be
# h = biggest float can be
def getRandomList(s, l, h):
    lst = []
    while len(lst) != s:
        r = uniform(l, h)
        if not r in lst:
            lst.append(r)
    return lst

# This just generates the two powerset sorted lists that the 2^(n/2) algorithm makes.
# This is just a lazy way of doing it, this running time is way worse, but since
# this can be done in 2^(n/2) time, I just pretend its that running time lol
def getSortedPowerSets(lst):
    n = len(lst)
    l1 = lst[:n/2]
    l2 = lst[n/2:]
    xs = range(2**(n/2))
    ys1 = helper.getNums(l1, xs)
    ys2 = helper.getNums(l2, xs)
    return ys1, ys2

# this just checks using the regular 2^(n/2) algorithm to see if two values
# sum to the specified value
def checkListRegular(lst, x):
    lst1, lst2 = getSortedPowerSets(lst)
    left = 0
    right = len(lst2)-1
    while left < len(lst1) and right >= 0:
        sum = lst1[left] + lst2[right]
        if sum < x:
            left += 1
        elif sum > x:
            right -= 1
        else:
            return True
    return False

# this is my improved version of the above version
def checkListSmaller(lst, x):
    last = lst.pop()
    x1, x2 = x, x - last
    return checkhelper(lst, x1, x2)

# this is the same as the function 'checkListRegular', but it checks 2 values
# at the same time
def checkhelper(lst, x1, x2):
    lst1, lst2 = getSortedPowerSets(lst)
    left = [0, 0]
    right = [len(lst2)-1, len(lst2)-1]
    while 1:
        check = 0
        if left[0] < len(lst1) and right[0] >= 0:
            check += 1
            sum = lst1[left[0]] + lst2[right[0]]
            if sum < x1:
                left[0] += 1
            elif sum > x1:
                right[0] -= 1
            else:
                return True
        if left[1] < len(lst1) and right[1] >= 0:
            check += 1
            sum = lst1[left[1]] + lst2[right[1]]
            if sum < x2:
                left[1] += 1
            elif sum > x2:
                right[1] -= 1
            else:
                return True
        if check == 0:
            return False

n = 22
lst = getRandomList(n, 1, 3000)
startTime = time.time()
print checkListRegular(lst, -50)  # -50 so it does worst case scenario
startTime2 = time.time()
print checkListSmaller(lst, -50)  # -50 so it does worst case scenario
startTime3 = time.time()
print (startTime2 - startTime)
print (startTime3 - startTime2)
This is the helper library which I just use to generate the powerset list.
def dec_to_bin(x):
    return int(bin(x)[2:])

def getNums(lst, xs):
    sums = []
    n = len(lst)
    for i in xs:
        bin = str(dec_to_bin(i))
        bin = (n-len(bin))*"0" + bin
        chosen_items = getList(bin, lst)
        sums.append(sum(chosen_items))
    sums.sort()
    return sums

def getList(binary, lst):
    s = []
    for i in range(len(binary)):
        if binary[i] == "1":
            s.append(float(lst[i]))
    return s
then it generates the sorted power sets of both in 2^((n-1)/2) time
OK, since now the list has one less element. However, this is not a big deal; it's just a constant-factor improvement of 2^(1/2), since 2^((n-1)/2) = 2^(n/2) / 2^(1/2)...
then it does a linear search in both lists to see if 1 value in both lists sum to x or x-last (at same time with same running time) using a clever trick
... and this improvement goes away, because now you do twice as many operations to check for both the x and x-last sums instead of only for x.
can we not repeat this again and again?
No, you can't, for the same reason why you couldn't split the original algorithm again and again: the trick only works once, because as soon as you start looking for values in more than two lists you can't use the sorting trick anymore.
