quickSort algorithm on GeeksForGeeks question - python

I have a question about the quickSort algorithm on the GeeksForGeeks website here: https://www.geeksforgeeks.org/python-program-for-quicksort/
The quickSort consists of the partition function shown on GeeksForGeeks as follows:
def partition(arr, low, high):
    i = (low-1)         # index of smaller element
    pivot = arr[high]   # pivot
    for j in range(low, high):
        # If current element is smaller than or
        # equal to pivot
        if arr[j] <= pivot:
            # increment index of smaller element
            i = i+1
            arr[i], arr[j] = arr[j], arr[i]
    arr[i+1], arr[high] = arr[high], arr[i+1]
    return (i+1)
I am wondering why i is set to i = low - 1.
Why can't the function be rewritten like this (Notice all the i's):
def partition(arr, low, high):
    i = low
    pivot = arr[high]
    for j in range(low, high):
        if arr[j] <= pivot:
            arr[i], arr[j] = arr[j], arr[i]
            i += 1
    arr[i], arr[high] = arr[high], arr[i]
    return i

Your implementation of quicksort works. There are multiple correct implementations of quicksort out there. As for why GeeksForGeeks chose this particular one, you'd have to ask the author of the article.
I think your question brings up a good point: an algorithm can be implemented in different but similar ways.
I quickly wrote a script to test your version of the quicksort implementation. See it below.
import random
def partition(arr, low, high):
    i = low             # index of smaller element
    pivot = arr[high]   # pivot
    for j in range(low, high):
        # If current element is smaller than or
        # equal to pivot
        if arr[j] <= pivot:
            # swap, then increment index of smaller element
            arr[i], arr[j] = arr[j], arr[i]
            i = i+1
    arr[i], arr[high] = arr[high], arr[i]
    return i
def quickSort(arr, low, high):
    if len(arr) == 1:
        return arr
    if low < high:
        # pi is partitioning index, arr[pi] is now
        # at right place
        pi = partition(arr, low, high)
        # Separately sort elements before
        # partition and after partition
        quickSort(arr, low, pi-1)
        quickSort(arr, pi+1, high)
# Driver code to test above
arr = []
for i in range(0, 10000):
    r = random.randint(-100000, 100000)
    arr.append(r)
n = len(arr)
quickSort(arr, 0, n-1)
# using naive method to
# check sorted list
flag = 0
i = 1
while i < len(arr):
    if arr[i] < arr[i - 1]:
        flag = 1
    i += 1
# printing result
if not flag:
    print("Sorted")
else:
    print("Not sorted")
print(arr)
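To compare the two partition variants directly rather than only checking that the result is sorted, a minimal sketch like the following can help. The names partition_gfg, partition_alt and variants_agree are hypothetical and chosen just for this illustration. It runs both variants on copies of the same random list and checks that they return the same pivot index and leave the list in the same state.

```python
import random

def partition_gfg(arr, low, high):
    # GeeksForGeeks variant: i starts one slot before the range
    i = low - 1
    pivot = arr[high]
    for j in range(low, high):
        if arr[j] <= pivot:
            i += 1
            arr[i], arr[j] = arr[j], arr[i]
    arr[i + 1], arr[high] = arr[high], arr[i + 1]
    return i + 1

def partition_alt(arr, low, high):
    # Variant from the question: i starts at low and advances after the swap
    i = low
    pivot = arr[high]
    for j in range(low, high):
        if arr[j] <= pivot:
            arr[i], arr[j] = arr[j], arr[i]
            i += 1
    arr[i], arr[high] = arr[high], arr[i]
    return i

def variants_agree(trials=200):
    # Run both variants on identical copies and compare index and contents
    for _ in range(trials):
        data = [random.randint(-50, 50) for _ in range(random.randint(1, 30))]
        a, b = data[:], data[:]
        pa = partition_gfg(a, 0, len(a) - 1)
        pb = partition_alt(b, 0, len(b) - 1)
        if pa != pb or a != b:
            return False
    return True

print(variants_agree())
```

The i in the second variant is always exactly one more than the i in the first, so both perform the same swaps in the same order.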

It's all about efficiency: by starting both i and j at low, you are testing a value against itself in the first run of the loop, which does absolutely nothing in the scope of sorting an array.
To change that and keep the efficiency, you would have to change the implementation of the quickSort method (the one that calls partition), but by doing that you end up touching the same index multiple times between the two recursive calls, which decreases efficiency again.
Quicksort is all about speed (hence the name). The changes you made definitely don't break the algorithm; however, when you scale up the input they do decrease its speed.

Related

Attempting to implement the quicksort technique shown in the following video

This is the video I am referring to:
https://youtu.be/ywWBy6J5gz8
This is the code that I have tried in python:
def partition(arr, low, high):
    pivot = arr[low]
    i = low + 1
    j = high
    switch = True
    while (True):
        while switch == True:
            while arr[j] >= pivot and i < j:
                j -= 1
            pos = arr.index(pivot)
            arr[j] , arr[pos] = arr[pos], arr[j]
            switch = False
        while switch == False:
            while arr[i] <= pivot and i < j:
                i += 1
            pos2 = arr.index(pivot)
            arr[i] , arr[pos2] = arr[pos2], arr[i]
            switch = True
        if i >= j:
            return j
def quickSort(arr, low, high):
    if high - low >= 1:
        pivot = partition(arr, low, high)
        quickSort(arr, low, pivot - 1)
        quickSort(arr, pivot + 1, high)
arr = [3, 1, 5, 7, 6, 2, 4]
quickSort(arr, 0, len(arr) - 1)
print("Sorted array:")
print(arr)
Output:
Sorted array:
[1, 2, 3, 6, 4, 5, 7]
What am I doing wrong??
There are a few issues in your code:
In the video the pivot is one of the two people who step forward, the one with the dark hat (without the light decoration). There is no third person in this process, yet you define i and j as indices separate from the index where the pivot resides. To harmonise your code with the video you should initialise i to low. The two people who step forward represent i and j in your code.
The partitioning should really end when i >= j, yet if this condition occurs in the first block, the second block still executes and performs an undesired swap there. You should exit the function immediately when i >= j. If you code it like that, you don't need the switch variable anymore either.
The call of arr.index(pivot) is not required, nor efficient, nor suggested by the video. You already know where the pivot is, as it is one of the two people who step forward (dark hat), so it is at either i or j.
Here is the corrected code:
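def partition(arr, low, high):
    pivot = arr[low]
    i = low
    j = high
    while True:
        # pivot is at i: move j left until a smaller value is found
        # (the i < j guard keeps j from running past the pivot)
        while arr[j] >= pivot and i < j:
            j -= 1
        if i >= j:
            return i
        arr[j], arr[i] = pivot, arr[j]
        # pivot is now at j: move i right until a larger value is found
        while arr[i] <= pivot and i < j:
            i += 1
        if i >= j:
            return j
        arr[i], arr[j] = pivot, arr[i]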
This now implements what is shown in the video.
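As a quick sanity check of this two-pointer scheme, here is a minimal self-contained sketch. The names partition_video, quick_sort and sorts_correctly are hypothetical. It assumes both scans are guarded by i < j so that neither index can leave the [low, high] range, and it compares the result with Python's built-in sorted on random lists that contain duplicates.

```python
import random

def partition_video(arr, low, high):
    # The pivot starts at arr[low] and always sits at index i or index j
    pivot = arr[low]
    i, j = low, high
    while True:
        # Pivot is at i: scan j leftwards for a value smaller than the pivot
        while arr[j] >= pivot and i < j:
            j -= 1
        if i >= j:
            return i
        arr[j], arr[i] = pivot, arr[j]
        # Pivot is now at j: scan i rightwards for a value larger than the pivot
        while arr[i] <= pivot and i < j:
            i += 1
        if i >= j:
            return j
        arr[i], arr[j] = pivot, arr[i]

def quick_sort(arr, low, high):
    if high - low >= 1:
        p = partition_video(arr, low, high)
        quick_sort(arr, low, p - 1)
        quick_sort(arr, p + 1, high)

def sorts_correctly(trials=200):
    # Small value range on purpose, so duplicates are common
    for _ in range(trials):
        data = [random.randint(0, 20) for _ in range(random.randint(0, 40))]
        expected = sorted(data)
        quick_sort(data, 0, len(data) - 1)
        if data != expected:
            return False
    return True

print(sorts_correctly())
```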

Recursive Quicksort using Lambdas

I'm trying to implement a recursive quicksort algorithm using two methods (swap, partition) while running the main algorithm using recursion in a lambda expression. I'm getting an infinite recursion error and honestly I can't find the syntax error. Can someone help me out? Thanks :)
def swap(array, a, b):
    array[a], array[b] = array[b], array[a]
def partition(array, high, low):
    pivot = array[high]
    i = low
    for x in range(low, high-1):
        if array[x] < pivot:
            i+=1
            swap(array, array[x], array[high])
    return i
g = lambda array, low, high: g(array,low,partition(array,high,low)-1)+g(array,partition(array,high,low)+1,high) if low < high else print("not sorted")
There are several issues in partition:
The call to swap is passing values from your list, instead of indices.
Even when the previous mistake is corrected, it will either move the pivot value to the low+1 index, or it will not move at all.
The returned index i, should be the one where the pivot was moved. In a correct implementation that means i is the last index to which a value was moved, which was the value at index high. This is not what is happening, as already with the first swap the pivot value is moved.
The swap should be of the current value with the value at i, so that all values up to the one at index i are less or equal to the pivot value.
Here is the corrected partition function:
def partition(array, high, low):
    pivot = array[high]
    i = low - 1
    for x in range(low, high+1):
        if array[x] <= pivot:
            i+=1
            swap(array, x, i)
    return i
These are the issues in the function g:
It is supposed to perform the sort in-place, so the + operator for lists should not occur here, as that would create a new list. Moreover, the base case (in else) returns None (the result of print), so the + operator will fail with an error.
partition(array,high,low) is called twice, which is not only a waste, but the second call will in most cases return a different result, because the pivot can be different. This means the second call of g will potentially not work with an adjacent partition, but will either leave an (unsorted) gap, or work on an overlapping partition.
Here is a correction for the function g:
def g(array, low, high):
    if low < high:
        i = partition(array, high, low)
        g(array, low, i-1)
        g(array, i+1, high)
You should also consider using a better name than g, and change the order of the high/low parameters for partition: that reversed order is a good way to confuse the readers of your code.
Here is Hoare's quicksort algorithm implemented in Python -
def quicksort(A, lo, hi):
    if lo >= 0 and hi >= 0 and lo < hi:
        p = partition(A, lo, hi)
        quicksort(A, lo, p)
        quicksort(A, p + 1, hi)
def partition(A, lo, hi):
    pivot = A[(hi + lo) // 2]
    i = lo
    j = hi
    while True:
        while A[i] < pivot:
            i += 1
        while A[j] > pivot:
            j -= 1
        if i >= j:
            return j
        swap(A, i, j)
        # step past the swapped pair; without this the loop never ends
        # when A[i] and A[j] both equal the pivot
        i += 1
        j -= 1
def swap(A, i, j):
    A[i], A[j] = A[j], A[i]
You can write g using lambda if you wish, but I would recommend to define an ordinary function instead -
g = lambda a: quicksort(a, 0, len(a) - 1)
Given a sample input, x -
x = [5,0,9,7,4,2,8,3,1,6]
g(x)
print(x)
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
See this related Q&A if you would like to count the number of comparisons and swaps used.
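Inputs with repeated values are the case worth checking with a Hoare-style partition, because both pointers can stop on values equal to the pivot. The sketch below is one commonly published form of the scheme, with illustrative names hoare_partition and hoare_quicksort. Each pointer takes one step before scanning, so the pair that was just swapped is never examined twice.

```python
import random

def hoare_partition(A, lo, hi):
    pivot = A[(lo + hi) // 2]
    i, j = lo - 1, hi + 1
    while True:
        # Step first, then scan: guarantees progress even on equal values
        i += 1
        while A[i] < pivot:
            i += 1
        j -= 1
        while A[j] > pivot:
            j -= 1
        if i >= j:
            return j
        A[i], A[j] = A[j], A[i]

def hoare_quicksort(A, lo, hi):
    if lo < hi:
        p = hoare_partition(A, lo, hi)
        # The returned index belongs to the left part in Hoare's scheme
        hoare_quicksort(A, lo, p)
        hoare_quicksort(A, p + 1, hi)

data = [3, 3, 1, 3, 2, 3, 3, 1]
hoare_quicksort(data, 0, len(data) - 1)
print(data)
```

Note that the recursive calls use (lo, p) and (p + 1, hi): the index returned by this partition is a split point, not the final position of the pivot.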

Understanding Python - Variable Defined Outside Function, But Changed Inside Function with no return

I recently came across the QuickSort algorithm and found an example for it in Python on Geeksforgeeks here: https://www.geeksforgeeks.org/python-program-for-quicksort/
My question is this: the variable arr is defined outside of the function Quicksort.. so how is the variable known both in and out of the function without global or return?
Sorry if this was basically posted elsewhere or if posting other people's code is a no-no. Not trying to claim this as mine, I just haven't seen Python code do this before. It's not the algorithm itself nor the use of recursive functions.. it's the lack of global or return that confuses me.
def Partition(arr, low, high):
    i = low - 1
    pivot = arr[high]
    for j in range(low, high):
        if arr[j] <= pivot:
            i += 1
            arr[i], arr[j] = arr[j], arr[i]
    arr[i+1], arr[high] = arr[high], arr[i+1]
    return i+1
def QuickSort(arr, low, high):
    if len(arr) == 1:
        return arr
    if low < high:
        pi = Partition(arr, low, high)
        QuickSort(arr, low, pi-1)
        QuickSort(arr, pi+1, high)
arr = [10, 7, 8, 9, 1, 5]
n = len(arr)
QuickSort(arr, 0, n-1)
print("Sorted array is:")
for i in range(n):
    print("%d" % arr[i])
arr is being passed in as an argument here. It is misleading because the variable name in the function is the same as the global one, though it doesn't need to be. For instance, the following code is the same as what you posted:
def Partition(mylst, low, high):
    i = low - 1
    pivot = mylst[high]
    for j in range(low, high):
        if mylst[j] <= pivot:
            i += 1
            mylst[i], mylst[j] = mylst[j], mylst[i]
    mylst[i+1], mylst[high] = mylst[high], mylst[i+1]
    return i+1
def QuickSort(mylst, low, high):
    if len(mylst) == 1:
        return mylst
    if low < high:
        pi = Partition(mylst, low, high)
        QuickSort(mylst, low, pi-1)
        QuickSort(mylst, pi+1, high)
arr = [10, 7, 8, 9, 1, 5]
n = len(arr)
QuickSort(arr, 0, n-1)
print("Sorted array is:")
for i in range(n):
    print("%d" % arr[i])
Even aside from that, you can modify a global object from inside a function without the global keyword, as long as you do not rebind the name. The distinction here is that the code never assigns a new object to arr; it changes the contents of the list that arr refers to. The global keyword is only needed when a function assigns to the global name itself.
Functions that modify global variables directly are not common in practice, so don't worry too much about it. In this scenario, at least, the list is passed into the function, and the QuickSort function directly modifies the list that is passed into it (which just so happens to be arr in this example).
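The difference between changing a list's contents and rebinding a name can be seen in a small sketch. The function names mutate and rebind are made up for this illustration.

```python
def mutate(items):
    # Index assignment changes the list object that the caller also refers to
    items[0] = "changed"

def rebind(items):
    # Assigning to the bare name only rebinds the local variable;
    # the caller's list is left untouched
    items = ["new", "list"]

data = ["original", "value"]

mutate(data)
after_mutate = list(data)   # snapshot after in-place modification

rebind(data)
after_rebind = list(data)   # snapshot after the local rebinding

print(after_mutate)
print(after_rebind)
```

Both snapshots show the same contents: the call to mutate is visible to the caller, while the assignment inside rebind is not. The swaps in Partition are of the first kind, which is why the sort needs neither global nor return.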

Python counting changes and compares in Quick-sort function

I've written the well-known Quick-Sort function and I want to add counters for changes (swaps) and compares, but I can't figure out how. I know how to pass variables between two functions, but I have no idea how that works in a recursive function.
def Partition(array, low, high):
    pivot = array[high]
    i = low - 1
    for j in range(low, high):
        if array[j] <= pivot:
            i += 1
            array[i], array[j] = array[j], array[i]
    array[i + 1], array[high] = array[high], array[i + 1]
    return i + 1
def QuickSort(array, low, high):
    if low < high:
        pivot = Partition(array, low, high)
        QuickSort(array, low, pivot - 1)
        QuickSort(array, pivot + 1, high)
    return array
print(QuickSort(array, 0, len(array) - 1))
Here is an example of how I was passing the variables between two different functions:
def test2(changes, compares):
    changes += 1
    compares += 1
    return changes, compares
def test1(changes = 0, compares = 0):
    changes, compares = test2(changes, compares)
    return changes, compares
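One possible way to carry the counters through the recursion, sketched here under the assumption that a single mutable dictionary is acceptable, is to pass that dictionary to every call and update it in place. The stats parameter and the lowercase function names are additions for this illustration and are not part of the original code. Every swap is counted, including swaps of an element with itself.

```python
def partition(array, low, high, stats):
    pivot = array[high]
    i = low - 1
    for j in range(low, high):
        stats["compares"] += 1          # one comparison against the pivot
        if array[j] <= pivot:
            i += 1
            array[i], array[j] = array[j], array[i]
            stats["changes"] += 1       # one swap (may be a self-swap)
    array[i + 1], array[high] = array[high], array[i + 1]
    stats["changes"] += 1               # final swap that places the pivot
    return i + 1

def quick_sort(array, low, high, stats):
    if low < high:
        p = partition(array, low, high, stats)
        # The same dictionary is shared by all recursive calls
        quick_sort(array, low, p - 1, stats)
        quick_sort(array, p + 1, high, stats)
    return array

stats = {"changes": 0, "compares": 0}
data = [10, 7, 8, 9, 1, 5]
quick_sort(data, 0, len(data) - 1, stats)
print(data)
print(stats)
```

Because the dictionary is mutated rather than reassigned, no return value is needed to get the totals back out. The alternative, closer to test1 and test2 above, is to return the two counts from each call and add them up in the caller.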

Unbound local error in maximum sub array problem from cormens algorithm

I am trying to follow Cormen's approach to solving the maximum-sum subarray problem using dynamic programming in Python. For this I have already written the maximum crossing subarray code, which is working fine.
def maxcrosssub(arr, low, mid, high):
    left_sum = float("-inf")
    sum = 0
    for i in range(mid, low-1, -1):
        sum += arr[i]
        if sum > left_sum:
            left_sum = sum
            max_left = i
    right_sum = float("-inf")
    sum = 0
    for j in range(mid+1, high):
        sum += arr[j]
        if sum > right_sum:
            right_sum = sum
            max_right = j
    return(max_left, max_right, left_sum+right_sum)
But the problem is in main program.
def max_subarray(arr, low, high):
    if low == high:
        return (low, high, arr[low])
    mid = (low+high)//2
    left_low, left_high, left_sum = max_subarray(arr, low, mid)
    right_low, right_high, right_sum = max_subarray(arr, mid+1, high)
    cross_low, cross_high, cross_sum = maxcrosssub(arr, low, mid, high)
    if left_sum >= right_sum and left_sum >= cross_sum:
        return(left_low, left_high, left_sum)
    elif right_sum >= left_sum and right_sum >= cross_sum:
        return(right_low, right_high, right_sum)
    else:
        return(cross_low, cross_high, cross_sum)
I am getting this error:
UnboundLocalError: local variable 'max_right' referenced before assignment.
I have tried declaring the variable as global, following some answers on Stack Overflow, but it is still not working.
Can anyone suggest what I am doing wrong?
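The problem
The problem comes from the first function, maxcrosssub: in some cases max_right is used (referenced) in the return statement before it has been assigned. It is only assigned inside the second loop, so if that loop body never runs, the name does not exist yet. This happens, for example, when range(mid+1, high) is empty because high == mid+1.
Solution
Assign a value to max_right at the beginning (before it's referenced). Note that a placeholder value only silences the error: since max_subarray passes high as a valid index, the loop bound most likely also needs to be high+1 so that the last element is included.
Try this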
def maxcrosssub(arr, low, mid, high):
    left_sum = float("-inf")
    sum = 0
    # Here is where you can assign a value to 'max_right'
    max_right = 0 # For example
    for i in range(mid, low-1, -1):
        sum += arr[i]
        if sum > left_sum:
            left_sum = sum
            max_left = i
    right_sum = float("-inf")
    sum = 0
    # high is a valid (inclusive) index in max_subarray, so go up to high+1
    for j in range(mid+1, high+1):
        sum += arr[j]
        if sum > right_sum:
            right_sum = sum
            max_right = j
    return(max_left, max_right, left_sum+right_sum)
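The mechanism behind this error can be reproduced in a few lines. In the sketch below, last_index is a hypothetical function, unrelated to the maximum subarray code, in which a local variable is assigned only inside a loop body. When the range is empty the assignment never happens, and the return statement fails.

```python
def last_index(start, stop):
    # `found` is assigned only inside the loop body
    for j in range(start, stop):
        found = j
    # Fails with UnboundLocalError if the range above was empty
    return found

# Non-empty range: the loop body runs and `found` exists
print(last_index(1, 3))

# Empty range: the loop body never runs, so `found` was never assigned
try:
    last_index(1, 1)
except UnboundLocalError as exc:
    print("UnboundLocalError:", exc)
```

The exact wording of the message differs between Python versions, but the exception type is the same.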
