I have two functions in Python that sort a list using quicksort:
    import datetime

    def partition(arr, low, high):
        i = (low - 1)  # index of smaller element
        pivot = arr[high]  # pivot
        for j in range(low, high):
            # If current element is smaller than or
            # equal to pivot
            if arr[j] <= pivot:
                # increment index of smaller element
                i = i + 1
                arr[i], arr[j] = arr[j], arr[i]
        arr[i + 1], arr[high] = arr[high], arr[i + 1]
        return (i + 1)

    def quickSort(arr, low, high):
        dt_started = datetime.datetime.utcnow()
        if low < high:
            # pi is partitioning index, arr[p] is now
            # at right place
            pi = partition(arr, low, high)
            # Separately sort elements before
            # partition and after partition
            quickSort(arr, low, pi - 1)
            quickSort(arr, pi + 1, high)
        dt_ended = datetime.datetime.utcnow()
        total_time = (dt_ended - dt_started).total_seconds()
        return total_time
Here dt_started is the start time of the function and dt_ended is the end time.
From my main, I am calling the function like this:
    total_time = quickSort(arr, 0, n - 1)
where arr is the list I want to sort and n is its size.
My question is: will quickSort() return the correct running time, given that it also makes multiple recursive calls?
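Yes, it will. The variables are local to each call of the function, so even when the function calls itself recursively they are not overwritten. The return values of the recursive calls are simply discarded, so the value returned to your main is the elapsed time of the outermost call, which includes all the nested calls.
You can test this by adding some tracing to the code. For example, you can print the values of dt_started and dt_ended (I'm not going into what the code itself does):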
    def quickSort(arr, low, high):
        print('>> called quickSort')
        print(' low: %s' % low)
        print(' high: %s' % high)
        dt_started = datetime.datetime.utcnow()
        print('dt_started: %s' % dt_started)
        if low < high:
            pi = partition(arr, low, high)
            quickSort(arr, low, pi - 1)
            quickSort(arr, pi + 1, high)
        # Printed by every call, including those where low >= high
        print('> other calls finished')
        print(' low: %s' % low)
        print(' high: %s' % high)
        print('dt_started: %s' % dt_started)
        dt_ended = datetime.datetime.utcnow()
        print(' dt_ended: %s' % dt_ended)
        total_time = (dt_ended - dt_started).total_seconds()
        return total_time
Running this function with a small array:
    In [4]: a = list(range(1, 100))
    In [5]: quickSort(a, 0, 1)
    >> called quickSort
     low: 0
     high: 1
    dt_started: 2019-03-26 14:59:41.840875
    >> called quickSort
     low: 0
     high: 0
    dt_started: 2019-03-26 14:59:41.840914
    > other calls finished
     low: 0
     high: 0
    dt_started: 2019-03-26 14:59:41.840914
     dt_ended: 2019-03-26 14:59:41.841138
    >> called quickSort
     low: 2
     high: 1
    dt_started: 2019-03-26 14:59:41.841253
    > other calls finished
     low: 2
     high: 1
    dt_started: 2019-03-26 14:59:41.841253
     dt_ended: 2019-03-26 14:59:41.841327
    > other calls finished
     low: 0
     high: 1
    dt_started: 2019-03-26 14:59:41.840875
     dt_ended: 2019-03-26 14:59:41.841370
    Out[5]: 0.000495
As you can see, the dt_started printed when the outermost call finishes is the same as the one printed when it began. It retains the value it had in the first call, so the total computed time is correct.
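If the per-call timestamps are not needed, one alternative is to keep the timing out of the recursive function and measure the whole sort once from the outside. The following is a minimal sketch under that assumption; the wrapper name timed_quick_sort is hypothetical, and time.perf_counter is used here in place of datetime because it is intended for measuring short intervals.

```python
import time


def partition(arr, low, high):
    """Lomuto partition: move arr[high] to its sorted position."""
    i = low - 1
    pivot = arr[high]
    for j in range(low, high):
        if arr[j] <= pivot:
            i += 1
            arr[i], arr[j] = arr[j], arr[i]
    arr[i + 1], arr[high] = arr[high], arr[i + 1]
    return i + 1


def quick_sort(arr, low, high):
    """Plain recursive quicksort with no timing code inside."""
    if low < high:
        pi = partition(arr, low, high)
        quick_sort(arr, low, pi - 1)
        quick_sort(arr, pi + 1, high)


def timed_quick_sort(arr):
    """Hypothetical wrapper: time the outermost call only."""
    started = time.perf_counter()
    quick_sort(arr, 0, len(arr) - 1)
    return time.perf_counter() - started


data = [10, 7, 8, 9, 1, 5]
elapsed = timed_quick_sort(data)
print(data, elapsed)
```

This avoids taking two timestamps in every recursive call, which otherwise adds a little overhead to the very thing being measured.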
Related
I have a question about the quickSort algorithm on the GeeksForGeeks website here: https://www.geeksforgeeks.org/python-program-for-quicksort/
The quickSort consists of the partition function shown on GeeksForGeeks as follows:
    def partition(arr, low, high):
        i = (low-1)  # index of smaller element
        pivot = arr[high]  # pivot
        for j in range(low, high):
            # If current element is smaller than or
            # equal to pivot
            if arr[j] <= pivot:
                # increment index of smaller element
                i = i+1
                arr[i], arr[j] = arr[j], arr[i]
        arr[i+1], arr[high] = arr[high], arr[i+1]
        return (i+1)
I am wondering why i is set to i = low - 1.
Why can't the function be rewritten like this (Notice all the i's):
    def partition(arr, low, high):
        i = low
        pivot = arr[high]
        for j in range(low, high):
            if arr[j] <= pivot:
                arr[i], arr[j] = arr[j], arr[i]
                i += 1
        arr[i], arr[high] = arr[high], arr[i]
        return i
Your implementation of quicksort works. There are multiple correct implementations of quicksort out there. GeeksForGeeks chose this particular one, but as to why, you'd have to ask the writer of the article.
I think your question brings up a good point: algorithms can be implemented in different but similar ways.
I quickly wrote a script to test your version of the quicksort implementation. See it below.
    import random

    def partition(arr, low, high):
        i = low  # next position for an element <= pivot
        pivot = arr[high]  # pivot
        for j in range(low, high):
            # If current element is smaller than or
            # equal to pivot
            if arr[j] <= pivot:
                arr[i], arr[j] = arr[j], arr[i]
                # increment index of smaller element
                i = i+1
        arr[i], arr[high] = arr[high], arr[i]
        return i

    def quickSort(arr, low, high):
        if len(arr) == 1:
            return arr
        if low < high:
            # pi is partitioning index, arr[pi] is now
            # at right place
            pi = partition(arr, low, high)
            # Separately sort elements before
            # partition and after partition
            quickSort(arr, low, pi-1)
            quickSort(arr, pi+1, high)

    # Driver code to test above
    arr = []
    for i in range(0, 10000):
        r = random.randint(-100000, 100000)
        arr.append(r)
    n = len(arr)
    quickSort(arr, 0, n-1)

    # using naive method to
    # check sorted list
    flag = 0
    i = 1
    while i < len(arr):
        if arr[i] < arr[i - 1]:
            flag = 1
        i += 1

    # printing result
    if not flag:
        print("Sorted")
    else:
        print("Not sorted")
    print(arr)
It's sometimes said to be about efficiency: with both i and j starting at low, the first iteration of the loop can swap an element with itself, which does nothing toward sorting the array. However, the GeeksForGeeks version increments i to low before its first swap, so it makes exactly the same self-swap.
The two versions perform the same comparisons and the same swaps in the same order; the only difference is whether i is incremented before or after the swap. No change to the quickSort method (the one that calls partition) is needed.
Quicksort is all about speed (hence the name). The changes you made don't break the algorithm, and they should not change its speed as you scale up the input.
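One way to check whether the two partition variants differ in the work they do is to record every comparison and swap each one performs on the same input and compare the logs. This is only a sketch: partition_gfg, partition_alt, and the ops log are names introduced here for illustration, and the seeded random input is an arbitrary choice.

```python
import random


def partition_gfg(arr, low, high, ops):
    """GeeksForGeeks variant: i starts at low - 1, incremented before the swap."""
    i = low - 1
    pivot = arr[high]
    for j in range(low, high):
        ops.append(("cmp", j))
        if arr[j] <= pivot:
            i += 1
            ops.append(("swap", i, j))
            arr[i], arr[j] = arr[j], arr[i]
    ops.append(("swap", i + 1, high))
    arr[i + 1], arr[high] = arr[high], arr[i + 1]
    return i + 1


def partition_alt(arr, low, high, ops):
    """Rewritten variant: i starts at low, incremented after the swap."""
    i = low
    pivot = arr[high]
    for j in range(low, high):
        ops.append(("cmp", j))
        if arr[j] <= pivot:
            ops.append(("swap", i, j))
            arr[i], arr[j] = arr[j], arr[i]
            i += 1
    ops.append(("swap", i, high))
    arr[i], arr[high] = arr[high], arr[i]
    return i


def quick_sort(arr, low, high, partition, ops):
    """Quicksort parameterised by the partition function under test."""
    if low < high:
        pi = partition(arr, low, high, ops)
        quick_sort(arr, low, pi - 1, partition, ops)
        quick_sort(arr, pi + 1, high, partition, ops)


rng = random.Random(0)  # fixed seed so the run is repeatable
data = [rng.randint(-1000, 1000) for _ in range(500)]
a, b = list(data), list(data)
ops_gfg, ops_alt = [], []
quick_sort(a, 0, len(a) - 1, partition_gfg, ops_gfg)
quick_sort(b, 0, len(b) - 1, partition_alt, ops_alt)
print(ops_gfg == ops_alt)  # True
print(a == b == sorted(data))  # True
```

Equal logs mean both variants made the same comparisons and swaps, in the same order, on that input.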
I recently came across the QuickSort algorithm and found a Python example of it on GeeksforGeeks here: https://www.geeksforgeeks.org/python-program-for-quicksort/
My question is this: the variable arr is defined outside of the function QuickSort, so how is the variable known both inside and outside the function without global or return?
Sorry if this was basically posted elsewhere or if posting other people's code is a no-no. I'm not trying to claim this as mine; I just haven't seen Python code do this before. It's not the algorithm itself nor the use of recursive functions; it's the lack of global or return that confuses me.
    def Partition(arr, low, high):
        i = low - 1
        pivot = arr[high]
        for j in range(low, high):
            if arr[j] <= pivot:
                i += 1
                arr[i], arr[j] = arr[j], arr[i]
        arr[i+1], arr[high] = arr[high], arr[i+1]
        return i+1

    def QuickSort(arr, low, high):
        if len(arr) == 1:
            return arr
        if low < high:
            pi = Partition(arr, low, high)
            QuickSort(arr, low, pi-1)
            QuickSort(arr, pi+1, high)

    arr = [10, 7, 8, 9, 1, 5]
    n = len(arr)
    QuickSort(arr, 0, n-1)
    print("Sorted array is:")
    for i in range(n):
        print("%d" % arr[i])
arr is being passed in as an argument here. It is misleading because the variable name in the function is the same as the global one, though it doesn't need to be. For instance, the following code is the same as what you posted:
    def Partition(mylst, low, high):
        i = low - 1
        pivot = mylst[high]
        for j in range(low, high):
            if mylst[j] <= pivot:
                i += 1
                mylst[i], mylst[j] = mylst[j], mylst[i]
        mylst[i+1], mylst[high] = mylst[high], mylst[i+1]
        return i+1

    def QuickSort(mylst, low, high):
        if len(mylst) == 1:
            return mylst
        if low < high:
            pi = Partition(mylst, low, high)
            QuickSort(mylst, low, pi-1)
            QuickSort(mylst, pi+1, high)

    arr = [10, 7, 8, 9, 1, 5]
    n = len(arr)
    QuickSort(arr, 0, n-1)
    print("Sorted array is:")
    for i in range(n):
        print("%d" % arr[i])
Even aside from that, as long as you are not rebinding a variable name, you can still modify global objects in Python. The distinction here is that you're not assigning a new value to arr, but changing the contents of the list it refers to, so even if you wanted to modify the global list directly you would not need the global keyword.
Relying on this is not common in practice, so don't worry too much about it. In this scenario at least, the list is passed into the function, and QuickSort directly modifies the list that is passed into it (which just happens to be arr in this example).
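The difference between changing a list's contents and reassigning a name can be seen in a few lines. This is a small illustrative sketch; the function names mutate_in_place and rebind_local_name are made up for the example.

```python
def mutate_in_place(lst):
    """Changes the contents of the list object the caller passed in."""
    lst[0] = 99


def rebind_local_name(lst):
    """Only points the local name at a new list; the caller's list is untouched."""
    lst = [0, 0, 0]
    return lst


arr = [1, 2, 3]
mutate_in_place(arr)
print(arr)  # [99, 2, 3]

rebind_local_name(arr)
print(arr)  # [99, 2, 3]
```

QuickSort behaves like mutate_in_place: it only ever assigns to elements of the list (arr[i], arr[j] = ...), never to the name itself, which is why the caller sees the sorted result without global or return.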
I've written the well-known quicksort function and I want to add counters for changes (swaps) and compares, but I can't figure out how. I know how to pass a variable between two functions, but I have no idea how that works in a recursive function.
    def Partition(array, low, high):
        pivot = array[high]
        i = low - 1
        for j in range(low, high):
            if array[j] <= pivot:
                i += 1
                array[i], array[j] = array[j], array[i]
        array[i + 1], array[high] = array[high], array[i + 1]
        return i + 1

    def QuickSort(array, low, high):
        if low < high:
            pivot = Partition(array, low, high)
            QuickSort(array, low, pivot - 1)
            QuickSort(array, pivot + 1, high)
        return array

    print(QuickSort(array, 0, len(array) - 1))
Here is an example of how I was passing the variables between two different functions:
    def test2(changes, compares):
        changes += 1
        compares += 1
        return changes, compares

    def test1(changes = 0, compares = 0):
        changes, compares = test2(changes, compares)
        return changes, compares
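The same return-and-accumulate pattern carries over to recursion: each call can return its own counts, and the caller adds up what its sub-calls report. The following is one possible sketch, not the only way to do it. It assumes that a "change" means one swap (including the final pivot swap) and a "compare" means one comparison against the pivot; the lowercase function names are chosen here for illustration.

```python
def partition(array, low, high):
    """Lomuto partition that also reports (pivot_index, swaps, compares)."""
    pivot = array[high]
    i = low - 1
    swaps = compares = 0
    for j in range(low, high):
        compares += 1  # one comparison against the pivot
        if array[j] <= pivot:
            i += 1
            array[i], array[j] = array[j], array[i]
            swaps += 1
    array[i + 1], array[high] = array[high], array[i + 1]
    swaps += 1  # the final swap that places the pivot
    return i + 1, swaps, compares


def quick_sort(array, low, high):
    """Sort in place and return the total (swaps, compares) for this range."""
    if low >= high:
        return 0, 0
    p, swaps, compares = partition(array, low, high)
    left_swaps, left_compares = quick_sort(array, low, p - 1)
    right_swaps, right_compares = quick_sort(array, p + 1, high)
    return (swaps + left_swaps + right_swaps,
            compares + left_compares + right_compares)


data = [10, 7, 8, 9, 1, 5]
swaps, compares = quick_sort(data, 0, len(data) - 1)
print(data, swaps, compares)
```

Because the list is sorted in place, the return value is free to carry the counters instead of the array.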
I am trying to follow Cormen's approach to solving the maximum-sum subarray problem using dynamic programming in Python. For this I have already written the maximum crossing subarray code, which is working fine.
    def maxcrosssub(arr, low, mid, high):
        left_sum = float("-inf")
        sum = 0
        for i in range(mid, low-1, -1):
            sum += arr[i]
            if sum > left_sum:
                left_sum = sum
                max_left = i
        right_sum = float("-inf")
        sum = 0
        for j in range(mid+1, high):
            sum += arr[j]
            if sum > right_sum:
                right_sum = sum
                max_right = j
        return(max_left, max_right, left_sum+right_sum)
But the problem is in main program.
    def max_subarray(arr, low, high):
        if low == high:
            return (low, high, arr[low])
        mid = (low+high)//2
        left_low, left_high, left_sum = max_subarray(arr, low, mid)
        right_low, right_high, right_sum = max_subarray(arr, mid+1, high)
        cross_low, cross_high, cross_sum = maxcrosssub(arr, low, mid, high)
        if left_sum >= right_sum and left_sum >= cross_sum:
            return(left_low, left_high, left_sum)
        elif right_sum >= left_sum and right_sum >= cross_sum:
            return(right_low, right_high, right_sum)
        else:
            return(cross_low, cross_high, cross_sum)
I am getting this error:
    UnboundLocalError: local variable 'max_right' referenced before assignment
I have tried using a global variable name, following some answers on Stack Overflow, but it is still not working.
Can anyone suggest what I am doing wrong?
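The problem
The problem comes from the first function, maxcrosssub: in some cases max_right is used (referenced) in the return statement before it has been assigned. Since right_sum starts at negative infinity, the condition sum > right_sum holds on the first iteration, so this only happens when the second loop never runs at all. range(mid+1, high) excludes high, so it is empty whenever mid+1 == high, for example with low = 0 and high = 1.
Solution
Assign a value to max_right at the beginning (before it's referenced), and let the second loop include high by using range(mid+1, high+1).
Try this: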
    def maxcrosssub(arr, low, mid, high):
        left_sum = float("-inf")
        sum = 0
        # Here is where you can assign a value to 'max_right'
        max_right = mid + 1  # For example
        for i in range(mid, low-1, -1):
            sum += arr[i]
            if sum > left_sum:
                left_sum = sum
                max_left = i
        right_sum = float("-inf")
        sum = 0
        # high+1 so that arr[high] is included; range(mid+1, high)
        # is empty when mid+1 == high
        for j in range(mid+1, high+1):
            sum += arr[j]
            if sum > right_sum:
                right_sum = sum
                max_right = j
        return(max_left, max_right, left_sum+right_sum)
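The failure mode itself can be reproduced in isolation, which may make it easier to see why the error appears only for some inputs. In this sketch, last_index is a hypothetical helper written for the demonstration; it assigns a variable only inside a loop, just as maxcrosssub does with max_right.

```python
def last_index(start, stop):
    """Hypothetical helper: return the last index visited by the loop."""
    for j in range(start, stop):
        found = j  # only assigned if the loop body runs at least once
    return found  # fails when range(start, stop) is empty


print(last_index(1, 3))  # 2

try:
    last_index(1, 1)  # empty range, like range(mid+1, high) when mid+1 == high
except UnboundLocalError as exc:
    print("UnboundLocalError:", exc)
```

Declaring the variable global does not help here, because the name is never assigned on that code path in the first place.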
    import random

    def rwSteps(start, low, hi):
        n = 0
        while low <= start <= hi:
            print((start-low-1)*" " + "#" + (hi-start)*" ", n)
            start += random.choice((-1, 1))
            n += 1
        return "%d steps" % (n-1)

    print(rwSteps(10, 5, 15))
The above function is the function that I need to rewrite in a recursive fashion. The function takes in a starting point integer, and a low and a high point. From the starting point, the function should either do +1 or -1 from the starting point randomly until either the high limit or the low limit is reached. Here is what I have so far.
    def RandomWalkSteps(start, low, hi):
        count = 0
        count = count + 1
        if(low <= start <= hi):
            count = count + 1
            start += random.choice((-1,1))
            newStart = start
            RandomWalkSteps(newStart, low, hi)
        return count
I feel like I'm pretty close, but I'm running into trouble of where to put the "count" statement so that it increments properly at every instance of recursion. Any help would be appreciated and feel free to yell at me if I left out any crucial piece of information.
    def RandomWalkSteps(start, low, hi):
        if low < start < hi:
            # take one random step from the current position, then recurse
            return 1 + RandomWalkSteps(start + random.choice((-1,1)), low, hi)
        return 0
    def RandomWalkSteps(start, low, hi, count=0):
        if low < start < hi:
            return RandomWalkSteps(start+random.choice((-1,1)), low, hi, count+1)
        return count

    print(RandomWalkSteps(10, 5, 15))
I believe this is what you are looking for
    def RandomWalkSteps(count, start, low, hi):
        if low <= start <= hi:
            start += random.choice((-1,1))
            newStart = start
            return RandomWalkSteps(count+1, newStart, low, hi)
        else:
            return count
Call RandomWalkSteps(0, x, y, z) instead of RandomWalkSteps(x, y, z).
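For a version whose result can be reproduced from run to run, the random source can be passed in as a parameter. This is a sketch under a few assumptions: the name random_walk_steps and the rng parameter are introduced here, the seed 42 is arbitrary, and the walk continues while start is within [low, hi] inclusive, as in the original loop.

```python
import random


def random_walk_steps(start, low, hi, rng, count=0):
    """Count the steps taken while start stays within [low, hi]."""
    if low <= start <= hi:
        next_pos = start + rng.choice((-1, 1))
        return random_walk_steps(next_pos, low, hi, rng, count + 1)
    return count


rng = random.Random(42)  # seeded so repeated runs give the same walk
steps = random_walk_steps(10, 5, 15, rng)
print("%d steps" % steps)
```

Each step costs one stack frame, so a very long walk could hit Python's default recursion limit (about 1000 frames); for a short interval like this one that is very unlikely, but the iterative loop is the safer choice for wide ranges.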