Is there ever a situation where you should use a recursive merge sort over an iterative merge sort? Originally I thought that iterative approaches to merge sort were typically faster, although I could not prove that in my own implementations. But recursion makes a lot more calls on the stack, which in turn makes it less memory-efficient. What if I have an extremely large dataset that I want to sort? Wouldn't it be bad to use recursion then, since excessively deep recursion eventually leads to a stack overflow? Why would you ever use recursive over iterative if it is slower and less memory-efficient?
def merge_sort(arr):
    if len(arr) <= 1:
        return arr
    current_size = 1  # length of the runs being merged in this pass
    while current_size < len(arr):
        left = 0
        while left < len(arr) - 1:
            mid = left + current_size - 1
            right = min(left + 2 * current_size - 1, len(arr) - 1)
            # merge two adjacent runs and write the result back in place
            merged_arr = merge(arr[left : mid + 1], arr[mid + 1 : right + 1])
            for i in range(left, right + 1):
                arr[i] = merged_arr[i - left]
            left = left + current_size * 2
        current_size = current_size * 2
    return arr
def merge(left, right):
    # standard two-pointer merge of two sorted lists into a new list
    result = []
    i = 0
    j = 0
    while i < len(left) and j < len(right):
        if left[i] < right[j]:
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1
    result += left[i:]   # append whichever side still has elements
    result += right[j:]
    return result
def merge_sort_recursive(arr):
    if len(arr) <= 1:
        return arr
    mid = len(arr) // 2  # split as evenly as possible
    left = merge_sort_recursive(arr[:mid])
    right = merge_sort_recursive(arr[mid:])
    return merge(left, right)
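One note on the stack-overflow worry: the recursion in merge sort only goes about log2(n) frames deep, because each call halves the array. A quick instrumented sketch (not part of the original question) to check:

import sys

# Hedged sketch: instrument the recursive sort above to record how deep the
# call stack actually gets. Depth grows as about log2(n) + 1, so even huge
# lists stay far below CPython's default recursion limit.
depth = 0
max_depth = 0

def merge_sort_depth(arr):
    global depth, max_depth
    depth += 1
    max_depth = max(max_depth, depth)
    if len(arr) > 1:
        mid = len(arr) // 2
        arr = merge(merge_sort_depth(arr[:mid]), merge_sort_depth(arr[mid:]))
    depth -= 1
    return arr

merge_sort_depth(list(range(2**17)))
print(max_depth, 'vs limit', sys.getrecursionlimit())  # 18 vs limit 1000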
Update: Hmm, after writing my own much simpler iterative one, I have to somewhat take back some of what I wrote...
def merge_sort_Kelly(arr):
    half = 1
    while half < len(arr):
        for mid in range(half, len(arr), 2 * half):
            start = mid - half
            stop = mid + half
            arr[start:stop] = merge(arr[start:mid], arr[mid:stop])
        half *= 2
    return arr
Times for sorting three shuffled copies of list(range(2**17)) (Try it online!):
1.35 seconds merge_sort
0.91 seconds merge_sort_recursive
0.90 seconds merge_sort_Kelly
1.25 seconds merge_sort
1.05 seconds merge_sort_recursive
0.92 seconds merge_sort_Kelly
1.34 seconds merge_sort
0.81 seconds merge_sort_recursive
0.88 seconds merge_sort_Kelly
It's pretty much as fast and I'd say almost as simple as the recursive one. Even the boundary check for stop was unnecessary after all, as Python slicing handles that for me. The imbalance issue remains.
About memory-efficiency: Actually your iterative one takes more memory than your recursive one, not less. Here are allocation peaks during sorting of list(range(2**17)) as measured with tracemalloc (Try it online!):
3,342,704 bytes merge_sort
2,892,479 bytes merge_sort_recursive
2,752,720 bytes merge_sort_Kelly
525,572 bytes merge_sort_Kelly2 (see text below)
The peaks are reached during the final / top-level merge. Your iterative one takes more because when computing the final merged_arr, that variable still holds the previous one. That can be avoided with del merged_arr when it's no longer needed; then it only takes 2,752,832 bytes. And of course all our solutions could take less memory if we didn't make so many slice copies but rather worked with indexes instead. That's what merge_sort_Kelly2 does: it only copies in its merge function, and there it copies just one half out and then merges that half and the other half into the original list.
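merge_sort_Kelly2 itself isn't shown; here is a minimal sketch of the idea as described (a reconstruction, not the original code): sort in place and, in the merge step, copy only the left half out, then merge it with the right half back into arr.

def merge_sort_Kelly2(arr):
    half = 1
    while half < len(arr):
        for mid in range(half, len(arr), 2 * half):
            start = mid - half
            stop = min(mid + half, len(arr))
            left = arr[start:mid]    # copy only the left half out
            i, j, k = start, 0, mid  # write pos, left pos, right pos
            while j < len(left) and k < stop:
                if left[j] <= arr[k]:
                    arr[i] = left[j]
                    j += 1
                else:
                    arr[i] = arr[k]
                    k += 1
                i += 1
            # whatever remains of the copied left half goes after i;
            # leftover right-half elements are already in place
            arr[i:i + len(left) - j] = left[j:]
        half *= 2
    return arr

print(merge_sort_Kelly2([3, 1, 4, 1, 5, 9, 2, 6]))  # [1, 1, 2, 3, 4, 5, 6, 9]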
end of update, original answer:
Why would you ever use recursive over iterative
Mainly because it's simpler/nicer. For example, your recursive one can sort [3, 1, 4] while your iterative one crashes with an IndexError. No doubt because it's more complicated.
The recursive one is also more balanced, needing fewer comparisons. Left and right are always equally large or differ by just one element. For example, for arr = list(range(2**17)), both do 1114112 comparisons, because both are equally perfectly balanced. But with 2**17+1 elements, the iterative one does 1245184 comparisons while the recursive one only does 1114113, because at the end the iterative one merges 2^17 elements with a single element (and that one element happens to be the largest).
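Those counts are easy to reproduce. A sketch (mine, not from the answer): wrap the values in an int subclass that counts < calls, since both implementations compare only with < inside merge.

# Hedged sketch: an int subclass that counts how often it is compared
# with <, to reproduce the comparison counts quoted above.
class Counting(int):
    count = 0
    def __lt__(self, other):
        Counting.count += 1
        return int(self) < int(other)

for n in (2**17, 2**17 + 1):
    for sort in merge_sort, merge_sort_recursive:
        Counting.count = 0
        sort([Counting(x) for x in range(n)])
        print(n, sort.__name__, Counting.count)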
I timed these two implementations and found iterative does in fact appear to be faster.
I get the opposite. Even for 2^17 elements, so that the iterative one doesn't have the imbalance issue. Times for sorting three lists both ways:
1.23 seconds merge_sort
0.83 seconds merge_sort_recursive
1.25 seconds merge_sort
0.82 seconds merge_sort_recursive
1.19 seconds merge_sort
0.80 seconds merge_sort_recursive
Code:
from random import shuffle
from time import time

for _ in range(3):
    arr = list(range(2**17))
    shuffle(arr)
    for sort in merge_sort, merge_sort_recursive:
        copy = arr[:]
        t0 = time()
        copy = sort(copy)
        print(f'{time()-t0:.2f} seconds {sort.__name__}')
        assert copy == sorted(arr)
    print()
Related
It has been several days of trying to implement a functional recursive merge sort in Python. Aside from that, I want to be able to print out each step of the sorting algorithm. Is there any way to make this Python code run in a functional-paradigm way? This is what I have so far...
def merge_sort(arr, low, high):
    # If low is less than high, proceed. Else, array is sorted
    if low < high:
        mid = (low + high) // 2            # Get the midpoint
        merge_sort(arr, low, mid)          # Recursively split the left half
        merge_sort(arr, mid+1, high)       # Recursively split the right half
        return merge(arr, low, mid, high)  # merge halves together

def merge(arr, low, mid, high):
    temp = []
    # Copy all the values into a temporary array for displaying
    for index, elem in enumerate(arr):
        temp.append(elem)

    left_p, right_p, i = low, mid+1, low

    # While left and right pointer still have elements to read
    while left_p <= mid and right_p <= high:
        if temp[left_p] <= temp[right_p]:  # If the left value is smaller, take it and shift the left pointer
            arr[i] = temp[left_p]
            left_p += 1
        else:                              # Else, take the right value and shift the right pointer
            arr[i] = temp[right_p]
            right_p += 1
        i += 1  # Increment target array pointer

    # Copy the rest of the left side of the array into the target array
    # (the rest of the right side is already in place)
    while left_p <= mid:
        arr[i] = temp[left_p]
        i += 1
        left_p += 1

    print(*arr)  # Display the current form of the array
    return arr

def main():
    # Get input from user
    arr = [int(input()) for x in range(int(input("Input the number of elements: ")))]
    print("Sorting...")
    sorted_arr = merge_sort(arr.copy(), 0, len(arr)-1)
    print("\nSorted Array")
    print(*sorted_arr)

if __name__ == "__main__":
    main()
Any help would be appreciated! Thank you.
In a purely functional merge sort, we don't want to mutate any values.
We can define a nice recursive version with zero mutation like so:
def merge(a1, a2):
    if len(a1) == 0:
        return a2
    if len(a2) == 0:
        return a1
    if a1[0] <= a2[0]:
        rec = merge(a1[1:], a2)
        return [a1[0]] + rec
    rec = merge(a1, a2[1:])
    return [a2[0]] + rec

def merge_sort(arr):
    if len(arr) <= 1:
        return arr
    halfway = len(arr) // 2
    left = merge_sort(arr[:halfway])
    right = merge_sort(arr[halfway:])
    return merge(left, right)
You can add a print(arr) at the top of merge_sort to print step by step, though technically the side effect will make it impure (although still referentially transparent, in this case). In Python, however, you can't separate the side effects from the pure computation using monads, so if you really want to avoid this print, you'll have to return the layers and print them at the end :)
Also, this version is technically doing a lot of copies of the list, so it's relatively slow. This can be fixed by using a linked list and consing / unconsing it. However, that's out of scope. (Note also that this merge recurses once per element, so CPython's default recursion limit caps it at lists of roughly a thousand elements.)
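For the "return the layers" route, here is a minimal sketch of that idea (my sketch, not part of the answer), reusing the merge defined above:

def merge_sort_logged(arr):
    # returns (sorted list, list of intermediate states); printing the log
    # afterwards keeps the sorting itself free of side effects
    if len(arr) <= 1:
        return arr, [arr]
    halfway = len(arr) // 2
    left, left_log = merge_sort_logged(arr[:halfway])
    right, right_log = merge_sort_logged(arr[halfway:])
    merged = merge(left, right)
    return merged, left_log + right_log + [merged]

sorted_arr, layers = merge_sort_logged([3, 1, 4, 1, 5])
for layer in layers:
    print(layer)  # all printing happens out here, after the pure computation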
I'm looking for the longest decreasing subsequence of integers in an array. Here I'm using a binary search (which I know is O(log n)), so I figured this code must be O(n log n). I tried my code on this particular input and it runs in 0.02 seconds. Now, I was searching on the internet and I found this code: http://www.geekviewpoint.com/python/dynamic_programming/lds. The author says it takes O(n^2), but on my same input it actually takes 0.01 seconds to run, which is obviously less than my O(n log n) algorithm.
def binary_search(arr, l, r, x):
    while r - l > 1:
        m = l + (r - l) // 2
        if arr[m] >= x:
            r = m
        else:
            l = m
    return r

def longest_decr_subseq_length(array, size):
    table = [0 for i in range(size + 1)]
    table[0] = array[size - 1]  # initialize with the last element
    length = 1
    for i in range(size - 1, -1, -1):
        if array[i] < table[0]:
            table[0] = array[i]
        elif array[i] > table[length - 1]:
            table[length] = array[i]
            length += 1
        else:
            table[binary_search(table, -1, length - 1, array[i])] = array[i]
    return length

lis = [38, 20, 15, 30, 90, 14, 6, 17]
n = len(lis)
print(longest_decr_subseq_length(lis, n))
So, is my algorithm O(n^2) too? Or is it normal that those are the running times? I'm sorry if the question appears silly, but I'm new to algorithms and still a bit confused.
Time complexity isn't the same as execution time. It describes how execution time grows as the input grows. So an algorithm with lower time complexity may still take longer on a given amount of data, but as you increase the input size, the algorithm with the lower time complexity will eventually be much faster.
As for the complexity of your algorithm, your calculation seems to be correct: it should be O(n log n).
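To watch that growth happen, here is a rough sketch (not part of the answer) timing the O(n log n) function from the question against a classic O(n^2) DP on growing inputs:

import random
import time

def lds_quadratic(array):
    # classic O(n^2) DP: best[i] = length of the longest strictly
    # decreasing subsequence ending at index i
    best = [1] * len(array)
    for i in range(len(array)):
        for j in range(i):
            if array[j] > array[i]:
                best[i] = max(best[i], best[j] + 1)
    return max(best, default=0)

for n in (500, 1000, 2000, 4000):
    data = [random.randint(0, n) for _ in range(n)]
    t0 = time.perf_counter()
    lds_quadratic(data)
    t1 = time.perf_counter()
    longest_decr_subseq_length(data, n)
    t2 = time.perf_counter()
    # doubling n roughly quadruples the O(n^2) time but only a bit more
    # than doubles the O(n log n) time, so the crossover appears quickly
    print(f'n={n}: O(n^2) {t1 - t0:.4f}s, O(n log n) {t2 - t1:.4f}s')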
I'm trying to design a function that, given an array A of N integers, returns the smallest positive integer (greater than 0) that does not occur in A.
This code works fine yet has a high order of complexity. Is there another solution that reduces the order of complexity?
Note: the 10000000 number is the range of integers in array A. I tried the sort function, but does it reduce the complexity?
def solution(A):
    for i in range(10000000):
        if A.count(i) <= 0:
            return i
The following is O(n log n):
a = [2, 1, 10, 3, 2, 15]

a.sort()
if a[0] > 1:
    print(1)
else:
    for i in range(1, len(a)):
        if a[i] > a[i - 1] + 1:
            print(a[i - 1] + 1)
            break
If you don't like the special handling of 1, you could just append zero to the array and have the same logic handle both cases:
a = sorted(a + [0])
for i in range(1, len(a)):
    if a[i] > a[i - 1] + 1:
        print(a[i - 1] + 1)
        break
Caveats (all trivial to fix and all left as an exercise for the reader):
Neither version handles empty input.
The code assumes there are no negative numbers in the input.
Neither version prints anything when there is no gap at all; in that case the answer is max(a) + 1.
O(n) time and O(n) space:
def solution(A):
    count = [0] * len(A)
    for x in A:
        if 0 < x <= len(A):
            count[x-1] = 1  # count[0] is to count 1
    for i in range(len(count)):
        if count[i] == 0:
            return i+1
    return len(A)+1  # only if A = [1, 2, ..., len(A)]
This should be O(n). Utilizes a temporary set to speed things along.
a = [2, 1, 10, 3, 2, 15]

# use a set of only the positive numbers for lookup
temp_set = set()
for i in a:
    if i > 0:
        temp_set.add(i)

# iterate from 1 up to length of set + 1 (to ensure the edge case is handled)
for i in range(1, len(temp_set) + 2):
    if i not in temp_set:
        print(i)
        break
My proposal is a recursive function inspired by quicksort.
Each step divides the input sequence into two sublists (lt = less than the pivot; ge = greater than or equal to the pivot) and decides which of the sublists is to be processed in the next step. Note that there is no sorting.
The idea is that a set of integers such that lo <= n < hi contains "gaps" only if it has fewer than (hi - lo) elements.
The input sequence must not contain duplicates. A set can be passed directly.
# all cseq items > 0 assumed, no duplicates!
def find(cseq, cmin=1):
    # cmin = possible minimum not ruled out yet
    size = len(cseq)
    if size <= 1:
        return cmin + 1 if cmin in cseq else cmin
    lt = []
    ge = []
    pivot = cmin + size // 2
    for n in cseq:
        (lt if n < pivot else ge).append(n)
    return find(lt, cmin) if cmin + len(lt) < pivot else find(ge, pivot)
test = set(range(1,100))
print(find(test)) # 100
test.remove(42)
print(find(test)) # 42
test.remove(1)
print(find(test)) # 1
Inspired by various solutions and comments above, about 20%-50% faster in my (simplistic) tests than the fastest of them (though I'm sure it could be made faster), and handling all the corner cases mentioned (non-positive numbers, duplicates, and empty list):
import numpy

def firstNotPresent(l):
    positive = numpy.fromiter(set(l), dtype=int)  # deduplicate
    positive = positive[positive > 0]             # only keep positive numbers
    positive.sort()
    top = positive.size + 1
    if top == 1:  # empty list
        return 1
    sequence = numpy.arange(1, top)
    try:
        # first index where the enumeration falls behind the sorted values;
        # the missing number is the sequence value there, i.e. index + 1
        return int(numpy.where(sequence < positive)[0][0]) + 1
    except IndexError:  # no numbers are missing, top is next
        return top
The idea is: if you enumerate the positive, deduplicated, sorted list starting from one, the first time the index is less than the list value, the index value is missing from the list, and hence is the lowest positive number missing from the list.
This and the other solutions I tested against (those from adrtam, Paritosh Singh, and VPfB) all appear to be roughly O(n), as expected. (It is, I think, fairly obvious that this is a lower bound, since every element in the list must be examined to find the answer.) Edit: looking at this again, of course the big-O for this approach is at least O(n log(n)), because of the sort. It's just that the sort is so fast comparatively speaking that it looked linear overall.
I am trying to solve the 3 Sum problem stated as:
Given an array S of n integers, are there elements a, b, c in S such that a + b + c = 0? Find all unique triplets in the array which gives the sum of zero.
Note: The solution set must not contain duplicate triplets.
Here is my solution to this problem:
def threeSum(nums):
    """
    :type nums: List[int]
    :rtype: List[List[int]]
    """
    nums.sort()
    n = len(nums)
    solutions = []
    for i, num in enumerate(nums):
        if i > n - 3:
            break
        left, right = i+1, n-1
        while left < right:
            s = num + nums[left] + nums[right]  # check if current sum is 0
            if s == 0:
                new_solution = [num, nums[left], nums[right]]
                # add to the solution set only if this triplet is unique
                if new_solution not in solutions:
                    solutions.append(new_solution)
                right -= 1
                left += 1
            elif s > 0:
                right -= 1
            else:
                left += 1
    return solutions
This solution works fine with a time complexity of O(n**2 + k) and space complexity of O(k) where n is the size of the input array and k is the number of solutions.
While running this code on LeetCode, I am getting a TimeOut error for arrays of large size. I would like to know how I can further optimize my code to pass the judge.
P.S: I have read the discussion in this related question. This did not help me resolve the issue.
A couple of improvements you can make to your algorithm:
1) Use a set instead of a list for your solutions. Using a set will ensure you don't have any duplicates, and you don't have to do the if new_solution not in solutions: check (which is linear in the number of solutions found so far).
2) Add an edge case check for an all-zero list. Not too much overhead, but it saves a HUGE amount of time for some cases.
3) Change enumerate to a second while loop. It is a little faster. Weirdly enough, I am getting better performance in the test with a while loop than with n_max = n - 2; for i in range(0, n_max):. Reading this question and answer, xrange or range should be faster.
NOTE: If I run the test 5 times I won't get the same time for any of them. All my tests are ±100 ms. So take some of the small optimizations with a grain of salt. They might NOT really be faster for all Python programs. They might only be faster for the exact hardware/software config the tests are running on.
ALSO: If you remove all the comments from the code it is a LOT faster, HAHAHAH, like 300 ms faster. Just a funny side effect of however the tests are being run.
I have put O() annotations on all of the parts of your code that take a lot of time.
def threeSum(nums):
    """
    :type nums: List[int]
    :rtype: List[List[int]]
    """
    # timsort: O(nlogn)
    nums.sort()
    # Stored val: Really fast
    n = len(nums)
    # Memory alloc: Fast
    solutions = []
    # O(n) for enumerate
    for i, num in enumerate(nums):
        if i > n - 3:
            break
        left, right = i+1, n-1
        # O(1/2k) where k is n-i? Not 100% sure about this one
        while left < right:
            s = num + nums[left] + nums[right]  # check if current sum is 0
            if s == 0:
                new_solution = [num, nums[left], nums[right]]
                # add to the solution set only if this triplet is unique
                # O(n) for not in
                if new_solution not in solutions:
                    solutions.append(new_solution)
                right -= 1
                left += 1
            elif s > 0:
                right -= 1
            else:
                left += 1
    return solutions
Here is some code that won't time out and is fast(ish). It also hints at a way to make the algorithm WAY faster (Use sets more ;) )
class Solution(object):
    def threeSum(self, nums):
        """
        :type nums: List[int]
        :rtype: List[List[int]]
        """
        # timsort: O(nlogn)
        nums.sort()
        # Stored val: Really fast
        n = len(nums)
        # Hash table
        solutions = set()
        # O(n): hash tables are really fast :)
        unique_set = set(nums)
        # covers a lot of edge cases with 2 memory lookups and 1 hash so it's worth the time
        if len(unique_set) == 1 and 0 in unique_set and len(nums) > 2:
            return [[0, 0, 0]]
        # O(n) but a little faster than enumerate.
        i = 0
        while i < n - 2:
            num = nums[i]
            left = i + 1
            right = n - 1
            # O(1/2k) where k is n-i? Not 100% sure about this one
            while left < right:
                # I think it's worth the memory alloc for the vars to not have to hit the
                # list index twice. Not sure how much faster it really is. Might save two
                # lookups per cycle.
                left_num = nums[left]
                right_num = nums[right]
                s = num + left_num + right_num  # check if current sum is 0
                if s == 0:
                    # add to the solution set only if this triplet is unique
                    # Hash lookup
                    solutions.add(tuple([right_num, num, left_num]))
                    right -= 1
                    left += 1
                elif s > 0:
                    right -= 1
                else:
                    left += 1
            i += 1
        return list(solutions)
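One reading of the "use sets more" hint (my sketch, not the answer's code): for each first element, look the needed third value up in a hash set instead of two-pointer scanning. Still O(n^2) overall, but every lookup is O(1) and deduplication is free:

def threeSum_hashed(nums):
    nums.sort()
    solutions = set()
    for i in range(len(nums) - 2):
        if i > 0 and nums[i] == nums[i - 1]:
            continue  # skip duplicate first elements
        seen = set()
        for j in range(i + 1, len(nums)):
            third = -(nums[i] + nums[j])
            if third in seen:
                # nums[i] <= third <= nums[j], so tuples come out sorted
                solutions.add((nums[i], third, nums[j]))
            seen.add(nums[j])
    return [list(t) for t in solutions]

print(threeSum_hashed([-1, 0, 1, 2, -1, -4]))  # [[-1, -1, 2], [-1, 0, 1]] in some order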
I benchmarked the faster code provided by PeterH, but I found a faster solution, and the code is simpler too.
class Solution(object):
    def threeSum(self, nums):
        res = []
        nums.sort()
        length = len(nums)
        for i in xrange(length-2):                         #[8]
            if nums[i] > 0: break                          #[7]
            if i > 0 and nums[i] == nums[i-1]: continue    #[1]
            l, r = i+1, length-1                           #[2]
            while l < r:
                total = nums[i]+nums[l]+nums[r]
                if total < 0:                              #[3]
                    l += 1
                elif total > 0:                            #[4]
                    r -= 1
                else:                                      #[5]
                    res.append([nums[i], nums[l], nums[r]])
                    while l < r and nums[l] == nums[l+1]:  #[6]
                        l += 1
                    while l < r and nums[r] == nums[r-1]:  #[6]
                        r -= 1
                    l += 1
                    r -= 1
        return res
https://leetcode.com/problems/3sum/discuss/232712/Best-Python-Solution-(Explained)
What is the fastest way to sort an array of whole integers bigger than 0 and less than 100000 in Python? But not using the built-in functions like sort.
I'm looking at the possibility of combining 2 sort functions depending on input size.
If you are interested in asymptotic time, then counting sort or radix sort provide good performance.
However, if you are interested in wall clock time, you will need to compare performance between different algorithms using your particular data sets, as different algorithms perform differently with different datasets. In that case, it's always worth trying quicksort:
def qsort(inlist):
    if inlist == []:
        return []
    else:
        pivot = inlist[0]
        lesser = qsort([x for x in inlist[1:] if x < pivot])
        greater = qsort([x for x in inlist[1:] if x >= pivot])
        return lesser + [pivot] + greater
Source: http://rosettacode.org/wiki/Sorting_algorithms/Quicksort#Python
Since you know the range of numbers, you can use Counting Sort which will be linear in time.
Radix sort theoretically runs in linear time (sort time grows roughly in direct proportion to array size), but in practice quicksort is probably more suited, unless you're sorting absolutely massive arrays.
If you want to make quicksort a bit faster, you can use insertion sort when the array size becomes small (see the sketch below).
It would probably be helpful to understand the concepts of algorithmic complexity and Big-O notation too.
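Here is a rough sketch of that hybrid idea (mine, not from any answer); the cutoff of 16 and the median-of-three pivot are arbitrary illustrative choices:

def hybrid_sort(a, lo=0, hi=None):
    # quicksort with a median-of-three pivot, falling back to insertion
    # sort once a segment has 16 or fewer elements
    if hi is None:
        hi = len(a) - 1
    while hi - lo > 15:
        mid = (lo + hi) // 2
        pivot = sorted((a[lo], a[mid], a[hi]))[1]  # median of three
        i, j = lo, hi
        while i <= j:  # Hoare-style partition around the pivot value
            while a[i] < pivot:
                i += 1
            while a[j] > pivot:
                j -= 1
            if i <= j:
                a[i], a[j] = a[j], a[i]
                i += 1
                j -= 1
        hybrid_sort(a, lo, j)  # recurse on the left part...
        lo = i                 # ...and keep looping on the right part
    # small segment: insertion sort beats further recursion here
    for k in range(lo + 1, hi + 1):
        item, m = a[k], k - 1
        while m >= lo and a[m] > item:
            a[m + 1] = a[m]
            m -= 1
        a[m + 1] = item

data = [27, 3, 99, 3, 14, 71, 0, 8]
hybrid_sort(data)
print(data)  # [0, 3, 3, 8, 14, 27, 71, 99]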
Early versions of Python used a hybrid of samplesort (a variant of quicksort with a large sample size) and binary insertion sort as the built-in sorting algorithm. This proved to be somewhat unstable. So, from Python 2.3 onward, the built-in sort uses Timsort, an adaptive mergesort algorithm.
Order of mergesort (average) = O(n log n).
Order of mergesort (worst) = O(n log n).
But order of quicksort (worst) = O(n^2).
So if you use a list, list = [ .............. ], then list.sort() uses this adaptive mergesort (Timsort) algorithm.
For a comparison between sorting algorithms you can read the wiki.
For a detailed comparison, see comp.
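A small demonstration (not part of the answer) of that quicksort worst case: with a first-element pivot, as in the rosettacode-style qsort earlier in the thread, already-sorted input degrades to quadratic work and linear recursion depth.

def qsort(inlist):
    # first-element pivot: fine on shuffled data, worst case on sorted data
    if inlist == []:
        return []
    pivot = inlist[0]
    lesser = qsort([x for x in inlist[1:] if x < pivot])
    greater = qsort([x for x in inlist[1:] if x >= pivot])
    return lesser + [pivot] + greater

print(qsort([3, 1, 2]))     # [1, 2, 3]
# qsort(list(range(2000)))  # O(n^2) comparisons, and the recursion depth
#                           # grows linearly: this raises RecursionError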
I might be a little late to the show, but there's an interesting article that compares different sorts at https://www.linkedin.com/pulse/sorting-efficiently-python-lakshmi-prakash
One of the main takeaways is that while the default sort does great, we can do a little better with a compiled version of quicksort. This requires the Numba package.
Here's a link to the Github repo:
https://github.com/lprakash/Sorting-Algorithms/blob/master/sorts.ipynb
We can use count sort using a dictionary to minimize the additional space usage and keep the running time low as well. The count sort is much slower for small input arrays because of the Python-vs-C implementation overhead; it starts to overtake the regular sort when the size of the array (COUNT) is about 1 million.
If you really want huge speedups for smaller size inputs, implement the count sort in C and call it from Python.
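Short of writing the C yourself, numpy's bincount is already a C-implemented counting pass. A hedged sketch (not part of the original answer) of a counting sort built on it:

import numpy as np

def counting_sort_np(array):
    # counts[v] = number of occurrences of v; both steps run in C
    counts = np.bincount(array)
    return np.repeat(np.arange(len(counts)), counts)

print(counting_sort_np([5, 3, 5, 1]))  # [1 3 5 5]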
(Fixed a bug which Aaron (+1) helped catch ...)
The Python-only implementation below compares the two approaches...
import random
import time

COUNT = 3000000

array = [random.randint(1, 100000) for i in range(COUNT)]
random.shuffle(array)

array1 = array[:]
start = time.time()
array1.sort()
end = time.time()
time1 = (end - start)
print 'Time to sort = ', time1 * 1000, 'ms'

array2 = array[:]
start = time.time()
ardict = {}
for a in array2:
    try:
        ardict[a] += 1
    except KeyError:  # first occurrence of this value
        ardict[a] = 1
indx = 0
for a in sorted(ardict.keys()):
    b = ardict[a]
    array2[indx:indx+b] = [a for i in xrange(b)]
    indx += b
end = time.time()
time2 = (end - start)
print 'Time to count sort = ', time2 * 1000, 'ms'
print 'Ratio =', time2 / time1
The built-in functions are best, but since you can't use them, have a look at this:
http://en.wikipedia.org/wiki/Quicksort
def sort(l):
    p = 0
    while p < len(l) - 1:
        if l[p] > l[p+1]:
            l[p], l[p+1] = l[p+1], l[p]
            if not p == 0:
                p = p - 1
        else:
            p += 1
    return l
This is an algorithm that I created, but it is really fast. Just do sort(l), where l is the list that you want to sort.
@fmark Some benchmarking of a Python merge-sort implementation I wrote against the Python quicksorts from http://rosettacode.org/wiki/Sorting_algorithms/Quicksort#Python and from the top answer.
The size of the list and the size of the numbers in the list are irrelevant: merge sort wins. However, it uses the built-in int() to floor.
import numpy as np

x = list(np.random.rand(100))

# TEST 1, merge_sort
def merge(l, p, q, r):
    n1 = q - p + 1
    n2 = r - q
    left = l[p : p + n1]
    right = l[q + 1 : q + 1 + n2]
    i = 0
    j = 0
    k = p
    while k < r + 1:
        if i == n1:
            l[k] = right[j]
            j += 1
        elif j == n2:
            l[k] = left[i]
            i += 1
        elif left[i] <= right[j]:
            l[k] = left[i]
            i += 1
        else:
            l[k] = right[j]
            j += 1
        k += 1

def _merge_sort(l, p, r):
    if p < r:
        q = int((p + r) / 2)
        _merge_sort(l, p, q)
        _merge_sort(l, q+1, r)
        merge(l, p, q, r)

def merge_sort(l):
    _merge_sort(l, 0, len(l)-1)

# TEST 2
def quicksort(array):
    _quicksort(array, 0, len(array) - 1)

def _quicksort(array, start, stop):
    if stop - start > 0:
        pivot, left, right = array[start], start, stop
        while left <= right:
            while array[left] < pivot:
                left += 1
            while array[right] > pivot:
                right -= 1
            if left <= right:
                array[left], array[right] = array[right], array[left]
                left += 1
                right -= 1
        _quicksort(array, start, right)
        _quicksort(array, left, stop)

# TEST 3
def qsort(inlist):
    if inlist == []:
        return []
    else:
        pivot = inlist[0]
        lesser = qsort([x for x in inlist[1:] if x < pivot])
        greater = qsort([x for x in inlist[1:] if x >= pivot])
        return lesser + [pivot] + greater

def test1():
    merge_sort(x)

def test2():
    quicksort(x)

def test3():
    qsort(x)

if __name__ == '__main__':
    import timeit
    print('merge_sort:', timeit.timeit("test1()", setup="from __main__ import test1, x;", number=10000))
    print('quicksort:', timeit.timeit("test2()", setup="from __main__ import test2, x;", number=10000))
    print('qsort:', timeit.timeit("test3()", setup="from __main__ import test3, x;", number=10000))
Bucket sort with bucket size = 1. Memory is O(m), where m = the range of values being sorted. Running time is O(n + m), where n = the number of items being sorted (the second pass walks all m buckets). When the integer type used to record counts is bounded, this approach will fail if any value appears more than MAXINT times.
def sort(items):
    seen = [0] * 100000
    for item in items:
        seen[item] += 1
    index = 0
    for value, count in enumerate(seen):
        for _ in range(count):
            items[index] = value
            index += 1