Intuition behind the recursive version of mergesort - Python

I have found the following part of a program of mergesort in a book:
def sort(v):
    if len(v)<=1:
        return v
    mid=len(v)//2
    v1,v2=sort(v[:mid]),sort(v[mid:])
    return merge(v1,v2)
The merge part compares each element of v1 and v2 and swaps them if necessary. My question is about the sort() function. For example, if I pass a list like [5,2,4,8,6,3], it gets divided into chunks and sort() is called recursively, but I cannot see at which point merge() is called. So, is it correct to suppose that the sequence of calls for the lower half is like this:
sort([5,2,4])=v1 sort([8,6,3])=v2
(is merge(v1,v2) called at this point, or does it wait until the list is exhausted?)
sort([5])=v1 sort([2,4])=v2
(because the length of v1 is not greater than 1, it returns v, which is [5]; at this point I do not know how it gets joined with v2)
[5] sort([2])=v1 sort([4])=v2
([5] has been returned and the right part gets sorted, so we will have [2,4])
In the last part I do not know whether merge should be called with [5] and [2,4] to do the ordering. Is it like that, or am I missing something?
Any help on how to correctly interpret this source code?
Thanks

To demonstrate how mergesort works, here is an implementation I wrote a while back:
def mergesort(lst):
    # SORT PART ------------------------------------------------
    # base case: return the list as-is if its length is 0 or 1
    if len(lst) <= 1:
        return lst
    # recursive case: do mergesort() on each half of the list
    mid = len(lst) // 2
    sub1, sub2 = mergesort(lst[:mid]), mergesort(lst[mid:])
    # MERGE PART ------------------------------------------------
    # merge sub1 and sub2, which are each sorted
    sorted_lst = []
    while sub1 and sub2:  # ...are not empty...
        # remove the lesser element from the front of sub1 or sub2 and add it to the sorted list
        sorted_lst.append(sub1.pop(0) if sub1[0] < sub2[0] else sub2.pop(0))
    # finally, once one of the lists is empty, append the remainder of the other list
    sorted_lst += (sub1 if sub1 else sub2)
    # and return the now-sorted list
    return sorted_lst
Essentially, mergesort splits the list in half, repeatedly, until it gets down to singleton lists, which are returned unchanged because a one-element list is already sorted. The caller then merges two singletons by putting the lower element before the higher one, and returns that.
Then, the next level up considers the two lists it got back and treats them both as queues sorted in ascending order: it removes the lowest element between them, then the next-lowest, and so on. The lowest element is always at the front of one of the two lists, because the lower recursive layers made it that way.

In a top-down merge sort, merging does not begin until two base cases have occurred, that is, until two sub-arrays have each been reduced to a single element. After that, the splitting and merging continue up and down the call chain, depth first and usually left first.
With the question's example code, recursion repeatedly follows the left path, sort(v[:mid]), until the one-element base case is reached. Only then does that instance return, allowing the second call, sort(v[mid:]), to run. That sub-list could have two elements, in which case one more level of recursion occurs, and then merging begins.
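The call order described above can be made visible by recording each call. The following is only a sketch: the book's merge() is not shown in the question, so the merge() here is a hypothetical stand-in, and the depth and trace parameters are additions made purely for illustration.

```python
def merge(v1, v2):
    """Merge two sorted lists into a new sorted list (stand-in for the book's merge)."""
    result = []
    i = j = 0
    while i < len(v1) and j < len(v2):
        if v1[i] <= v2[j]:
            result.append(v1[i])
            i += 1
        else:
            result.append(v2[j])
            j += 1
    result.extend(v1[i:])  # at most one of these two
    result.extend(v2[j:])  # slices is non-empty
    return result


def sort(v, depth=0, trace=None):
    """Same logic as the question's sort(), plus optional call tracing."""
    pad = "  " * depth
    if trace is not None:
        trace.append(f"{pad}sort({v})")
    if len(v) <= 1:
        return v
    mid = len(v) // 2
    v1 = sort(v[:mid], depth + 1, trace)   # left half is fully sorted first
    v2 = sort(v[mid:], depth + 1, trace)   # then the right half
    merged = merge(v1, v2)                 # merge runs only after both calls return
    if trace is not None:
        trace.append(f"{pad}merge({v1}, {v2}) -> {merged}")
    return merged


trace = []
result = sort([5, 2, 4, 8, 6, 3], trace=trace)
print("\n".join(trace))
print(result)
```

In the recorded trace, the first merge is merge([2], [4]), followed by merge([5], [2, 4]); the merge of the two sorted halves, merge([2, 4, 5], [3, 6, 8]), is the very last step.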

Related

Merge sorting algorithm in Python for two sorted lists - trouble constructing for-loop

I'm trying to create an algorithm to merge two ordered lists into a larger ordered list in Python. I began by trying to isolate the minimum element of each list and then compared them to see which was smaller, because that number would be the smallest in the larger list as well. I then appended that element to the empty larger list and deleted it from the original list it came from. I then tried to loop through the original two lists doing the same thing. Inside the "if" statements, I've tried to make the function append the remainder of one list to the larger list if the other is or becomes empty, because at that point there is no need to ask which of the two lists has the smaller element.
def merge_cabs(cab1, cab2):
    for (i <= all(j) for j in cab1):
        for (k <= all(l) for l in cab2):
            if cab1 == []:
                newcab.append(cab2)
            if cab2 == []:
                newcab.append(cab1)
            else:
                k = min(min(cab1), min(cab2))
                newcab.append(k)
                if min(cab1) < min(cab2):
                    cab1.remove(min(cab1))
                if min(cab2) < min(cab1):
                    cab2.remove(min(cab2))
    print(newcab)
cab1 = [1,2,5,6,8,9]
cab2 = [3,4,7,10,11]
newcab = []
merge_cabs(cab1, cab2)
I've had a bit of trouble constructing the for-loop unfortunately. One way I've tried to isolate the minimum values was as I wrote in the two "for" lines. Right now, Python is returning "SyntaxError: invalid syntax," pointing to the colon in the first "for" line. Another way I've tried to construct the for-loop was like this:
def merge_cabs(cabs1, cabs2):
    for min(i) in cab1:
        for min(j) in cab2:
I've also tried to write the expression all in one line like this:
def merge_cabs(cab1, cab2):
    for min(i) in cabs1 and min(j) in cabs2:
and to loop through a copy of the original lists rather than looping through the lists themselves, because searching through the site, I've found that it can sometimes be difficult to remove elements from a list you're looping through. I've also tried to protect the expressions after the "for" statements inside various configurations of parentheses. If someone sees where the problem(s) lies, it would really be great if you could point it out, or if you have any other observations that could help me better construct this function, I would really appreciate those too.
Here's a very simple-minded solution to this that uses only very basic Python operations:
def merge_cabs(cab1, cab2):
    len1 = len(cab1)
    len2 = len(cab2)
    i = 0
    j = 0
    newcab = []
    while i < len1 and j < len2:
        v1 = cab1[i]
        v2 = cab2[j]
        if v1 <= v2:
            newcab.append(v1)
            i += 1
        else:
            newcab.append(v2)
            j += 1
    while i < len1:
        newcab.append(cab1[i])
        i += 1
    while j < len2:
        newcab.append(cab2[j])
        j += 1
    return newcab
Things to keep in mind:
You should not have any nested loops. Merging two sorted lists is typically used to implement a merge sort, and the merge step should be linear. I.e., the algorithm should be O(n).
You need to walk both lists together, choosing the smallest value at each step, and advancing only the list that contains the smallest value. When one of the lists is consumed, the remaining elements from the unconsumed list are simply appended in order.
You should not be calling min or max etc. in your loop, since that will effectively introduce a nested loop, turning the merge into an O(n**2) algorithm, which ignores the fact that the lists are known to be sorted.
Similarly, you should not be calling any external sort function to do the merge, since that will result in an O(n*log(n)) merge (or worse, depending on the sort algorithm), and again ignores the fact that the lists are known to be sorted.
Firstly, there's a function in the (standard library) heapq module for doing exactly this, heapq.merge; if this is a real problem (rather than an exercise), you want to use that one instead.
If this is an exercise, there are a couple of points:
You'll need to use a while loop rather than a for loop:
while cab1 or cab2:
This will keep repeating the body while there are any items in either of your source lists.
You probably shouldn't delete items from the source lists; that's a relatively expensive operation. In addition, on balance, having a merge_lists function destroy its arguments would be unexpected.
Within the loop you'll refer to cab1[i1] and cab2[i2] (and, in the condition, to i1 < len(cab1)).
(By the time I typed out the explanation, Tom Karzes typed out the corresponding code in another answer...)
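For the non-exercise case mentioned above, a minimal sketch of heapq.merge applied to the question's two lists might look like this. The variable names are taken from the question; note that heapq.merge returns a lazy iterator, so it is wrapped in list() here.

```python
from heapq import merge

cab1 = [1, 2, 5, 6, 8, 9]
cab2 = [3, 4, 7, 10, 11]

# heapq.merge assumes each input is already sorted and yields items lazily.
newcab = list(merge(cab1, cab2))

print(newcab)
print(cab1, cab2)  # the source lists are left untouched
```

Because the inputs are only iterated over, neither source list is modified, which matches the advice about not destroying the arguments.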

Swap elements python

Look at successive pairs of elements in a list and swap them if they are out of order (possibly swapping a number more than once).
I have tried to use for loops, etc., but am unable to solve the problem.
deleted
I need to use functions rather than any python library. I can solve this using one (already have!) but I need to use low level beginner methods.
ex: bubble([2,1,4,3]) == [1,2,3,4]
You could use the indices of the elements in order to swap them:
def swap(seq, idx, jdx):
    """swaps the two elements of the sequence, identified by their indices
    in-place, mutates seq
    return: None
    """
    seq[idx], seq[jdx] = seq[jdx], seq[idx]
The one-liner that swaps the values creates a tuple of values on the right-hand side and unpacks it (assigns each value to a variable) on the left-hand side.
value_list =[4,3,2,1]
for a in range(len(value_list)):
    for b in range(len(value_list)):
        if value_list[b] > value_list[a]:
            value_list[b],value_list[a]=value_list[a],value_list[b]
print(value_list)
You can use this within a function like this:
value_list =[4,3,2,1]
def sort_list(given_list):
    for a in range(len(given_list)):
        for b in range(len(given_list)):
            if given_list[b] > given_list[a]:
                given_list[b],given_list[a]=given_list[a],given_list[b]
    return given_list
print(sort_list(value_list))
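Since the question asks specifically about comparing successive pairs, here is a sketch of a plain bubble sort built on an index-based swap helper. The name bubble comes from the question's example; returning a sorted copy rather than sorting in place is an assumption made so that the example call bubble([2,1,4,3]) == [1,2,3,4] holds.

```python
def swap(seq, idx, jdx):
    """Swap two elements of seq in place, identified by their indices."""
    seq[idx], seq[jdx] = seq[jdx], seq[idx]


def bubble(values):
    """Return a sorted copy of values using adjacent-pair swaps."""
    result = list(values)  # work on a copy so the caller's list is untouched
    for end in range(len(result) - 1, 0, -1):
        swapped = False
        # After each pass, the largest remaining value has moved to index `end`.
        for i in range(end):
            if result[i] > result[i + 1]:
                swap(result, i, i + 1)
                swapped = True
        if not swapped:
            break  # no swaps in a full pass means the list is already sorted
    return result


print(bubble([2, 1, 4, 3]))
```

Each pass compares only neighbours, so a value can be swapped several times before it reaches its final position.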

Python: selectionsort algorithm with queues

I have come across an exercise in Python:
1. Read in some strings and put them into a queue
2. Sort the strings lexicographically into a new queue, but the original queue shouldn't be changed. I should write a function from scratch (e.g. the sorted function cannot be used)
3. The use of arrays is not allowed
I think I have managed to come up with a function for step 1, but I have been struggling with step 2 for hours. I would really appreciate any kind of help!
Here is my code snippet for step 1:
q1 = []
def DisplayQueue(queue):
    for Item in queue:
        print(Item)
def PushQueue(queue):
    x = True
    while x:
        user_input = input("Please enter a string (for exit type: exit): ")
        if user_input == "exit":
            x = False
        else:
            queue.append(user_input)
    return queue
queue = PushQueue(q1)
Sorting can be done in many ways (bubble sort, insertion sort, quicksort, radix sort), but I recommend you start with something simple, even if it is not the fastest.
1. Create a queue (or list) to store the answer in.
2. Find the smallest element in the old data list. *
3. Remove that element from the old list. **
4. Add that element to the answer list.
5. Repeat from step 2 (if the data list is not empty).
The data list will get shorter with each pass, and the elements will be added in increasing order to the answer list.
*) To find the smallest element in the data list (you could put that in a separate function), keep the currently smallest value in a variable called x or something (start with the first item), and go through the data items one by one. If a data item is smaller than what you have in the variable x, then put the value of that data item into x.
Now, once you have gone through the whole data list, your variable x will contain the value of the smallest element in the list.
**) You can remove a value v from a list named data with data.remove(v). It removes only the first occurrence of that value.
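The steps above can be sketched as follows. Since the exercise says the original queue must not change, this version works on a copy; the function names find_smallest and sorted_queue and the sample data are hypothetical, chosen only for illustration.

```python
def find_smallest(items):
    """Return the smallest item by scanning the list once (items must be non-empty)."""
    smallest = items[0]
    for item in items:
        if item < smallest:
            smallest = item
    return smallest


def sorted_queue(queue):
    """Return a new queue with the items of `queue` in ascending order."""
    remaining = list(queue)  # copy, so the original queue is not modified
    answer = []
    while remaining:
        smallest = find_smallest(remaining)
        remaining.remove(smallest)  # removes only the first occurrence
        answer.append(smallest)
    return answer


q1 = ["pear", "apple", "fig", "apple"]
q2 = sorted_queue(q1)
print(q2)
print(q1)  # still in the original order
```

Strings compare lexicographically with `<`, so no extra comparison logic is needed for the exercise.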

Python lists - code & algorithm

I need some help with Python, a programming language that is new to me.
So, let's say that I have this list:
list= [3, 1, 4, 9, 8, 2]
And I would like to sort it, but without using the built-in "sort" function; otherwise, where's the fun and the learning? I want to code as simply and as basically as I can, even if it means working a bit harder. Therefore, if you want to help me and offer some ideas and code, please try to keep them very "basic".
Anyway, back to my problem: in order to sort this list, I've decided to compare each number in the list to the last number. First, I'll check 3 and 2. Is 3 smaller than 2? No, so do nothing.
Next, check whether 1 is smaller than 2 (it is), and if so, swap this number with the first element.
On the next run, it will check again whether the number is smaller than the last number in the list. But this time, if the number is smaller, it will swap places with the second number (and on the third run with the third number, if it's smaller, of course).
And so on.
In the end, the function will return the sorted list.
I hope you've understood it.
So I want to use a recursive function to make the task a bit more interesting, but still basic.
Therefore, I thought about this code:
def func(list):
    if not list:
        for i in range(len(list)):
            if list[-1] > lst[i]:
                #have no idea what to write here in order to change the locations
        i = i + 1
        #return func(lst[i+1:])?
    return list
2 questions:
1. How can I change the locations? Using pop/remove and then insert?
2. I don't know where to put the recursive part or whether I've written it correctly (I think I haven't). The recursive part is the second "#", the first "return".
What do you think? How can I improve this code? What's wrong?
Thanks a lot!
Oh man, sorting. That's one of the most popular problems in programming, with many, many solutions that differ a little in every language. Anyway, the most straightforward algorithm is, I guess, bubble sort. However, it's not very efficient, so it's mostly used for educational purposes. If you want to try something more efficient and common, go for quicksort. I believe it's the most popular sorting algorithm. In Python, however, the built-in sort uses a different algorithm (Timsort). And like I've said, there are many, many more sorting algorithms around the web.
Now, to answer your specific questions: in python replacing an item in a list is as simple as
list[-1]=list[i]
or
tmp=list[-1]
list[-1]=list[i]
list[i]=tmp
As to recursion - I don't think it's a good idea to use it, a simple while/for loop is better here.
Maybe you can try a quicksort along these lines:
def quicksort(array, up, down):
    # start scanning the array from down to up:
    # is array[up] < array[down]? if yes, swap them
    # repeat until up <= down
    # then call quicksort recursively
    # with the array, middle, up
    # with the array, down, middle
    # where middle is the value found when the first pass ended
You can check this link: Quicksort on Wikipedia
It is nearly the same logic.
Hope it helps!
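The outline above leaves the partitioning step open. As one possible way to fill it in, here is a sketch of an in-place quicksort using the Lomuto partition scheme; this is a swapped-in, well-known variant rather than a reconstruction of the pseudocode, and the parameter names low and high are assumptions that replace up and down.

```python
def quicksort(array, low=0, high=None):
    """Sort array[low..high] in place (Lomuto partition, last element as pivot)."""
    if high is None:
        high = len(array) - 1
    if low >= high:
        return  # zero or one element: nothing to do
    pivot = array[high]
    boundary = low  # everything left of boundary is smaller than the pivot
    for i in range(low, high):
        if array[i] < pivot:
            array[i], array[boundary] = array[boundary], array[i]
            boundary += 1
    # Put the pivot between the smaller and the not-smaller elements.
    array[boundary], array[high] = array[high], array[boundary]
    quicksort(array, low, boundary - 1)   # sort the part left of the pivot
    quicksort(array, boundary + 1, high)  # sort the part right of the pivot


numbers = [3, 1, 4, 9, 8, 2]
quicksort(numbers)
print(numbers)
```

Here recursion is a natural fit, because each call works on a smaller index range of the same list and no copies are made.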
The easiest way to swap the two list elements is by using “parallel assignment”:
list[-1], list[i] = list[i], list[-1]
It doesn't really make sense to use recursion for this algorithm. If you call func(lst[i+1:]), that makes a copy of those elements of the list, and the recursive call operates on the copy, and then the copy is discarded. You could make func take two arguments: the list and i+1.
But your code is still broken. The not list test is incorrect, and the i = i + 1 is incorrect. What you are describing sounds like a variation of selection sort where you're doing a bunch of extra swapping.
Here's how a selection sort normally works.
Find the smallest of all elements and swap it into index 0.
Find the smallest of all remaining elements (all indexes greater than 0) and swap it into index 1.
Find the smallest of all remaining elements (all indexes greater than 1) and swap it into index 2.
And so on.
To simplify, the algorithm is this: find the smallest of all remaining (unsorted) elements, and append it to the list of sorted elements. Repeat until there are no remaining unsorted elements.
We can write it in Python like this:
def func(elements):
    for firstUnsortedIndex in range(len(elements)):
        # elements[0:firstUnsortedIndex] are sorted
        # elements[firstUnsortedIndex:] are not sorted
        bestIndex = firstUnsortedIndex
        for candidateIndex in range(bestIndex + 1, len(elements)):
            if elements[candidateIndex] < elements[bestIndex]:
                bestIndex = candidateIndex
        # Now bestIndex is the index of the smallest unsorted element
        elements[firstUnsortedIndex], elements[bestIndex] = elements[bestIndex], elements[firstUnsortedIndex]
        # Now elements[0:firstUnsortedIndex+1] are sorted, so it's safe to increment firstUnsortedIndex
    # Now all elements are sorted.
Test:
>>> testList = [3, 1, 4, 9, 8, 2]
>>> func(testList)
>>> testList
[1, 2, 3, 4, 8, 9]
If you really want to structure this so that recursion makes sense, here's how. Find the smallest element of the list. Then call func recursively, passing all the remaining elements. (Thus each recursive call passes one less element, eventually passing zero elements.) Then prepend that smallest element onto the list returned by the recursive call. Here's the code:
def func(elements):
    if len(elements) == 0:
        return elements
    bestIndex = 0
    for candidateIndex in range(1, len(elements)):
        if elements[candidateIndex] < elements[bestIndex]:
            bestIndex = candidateIndex
    return [elements[bestIndex]] + func(elements[0:bestIndex] + elements[bestIndex + 1:])

alternative to recursion based merge sort logic

Here is a merge sort in Python (this is the first part; ignore the function merge()). The point in question is converting the recursive logic to a while loop.
Code courtesy: Rosettacode Merge Sort
def merge_sort(m):
    if len(m) <= 1:
        return m
    middle = len(m) // 2  # integer division, so the result is usable as a slice index
    left = m[:middle]
    right = m[middle:]
    left = merge_sort(left)
    right = merge_sort(right)
    return list(merge(left, right))
Is it possible to do this dynamically in a while loop, where each left and right list is split in two and some kind of pointer advances over the growing number of left and right lists, splitting them until only single-element lists remain?
I ask because with every split, on both the left and the right side, the lists keep breaking down, so the number of left-side pieces (left-left, left-right) and right-side pieces (right-left, right-right) keeps increasing until every piece is a list of size 1.
One possible implementation might be this:
def merge_sort(m):
    l = [[x] for x in m] # split each element to its own list
    while len(l) > 1: # while there's merging to be done
        for x in range(len(l) >> 1): # take the first len/2 lists
            # and merge with the last len/2 lists; list() in case merge returns an iterator
            l[x] = list(merge(l[x], l.pop()))
    return l[0] if len(l) else []
Stack frames in the recursive version are used to store progressively smaller lists that need to be merged. You correctly identified that at the bottom of the stack, there's a one-element list for each element in whatever you're sorting. So, by starting from a series of one-element lists, we can iteratively build up larger, merged lists until we have a single, sorted list.
Reposted from alternative to recursion based merge sort logic at the request of a reader:
One way to eliminate recursion is to use a queue to manage the outstanding work. For example, using the built-in collections.deque:
from collections import deque
from heapq import merge
def merge_sorted(iterable):
    """Return a list consisting of the sorted elements of 'iterable'."""
    queue = deque([i] for i in iterable)
    if not queue:
        return []
    while len(queue) > 1:
        queue.append(list(merge(queue.popleft(), queue.popleft())))
    return queue[0]
It's said that every recursive function can be written in a non-recursive manner, so the short answer is: yes, it's possible. The only solution I can think of is the stack-based approach. When a recursive function invokes itself, it puts some context (its arguments and return address) on the call stack, which isn't directly accessible to you. Basically, what you need to do in order to eliminate recursion is to write your own stack and, every time you would make a recursive call, push the arguments onto this stack instead.
For more information you can read this article, or refer to the section named 'Eliminating Recursion' in Robert Lafore's "Data Structures and Algorithms in Java" (although all the examples in this book are given in Java, it's pretty easy to grasp the main idea).
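As a rough illustration of the stack-based idea, here is a sketch of a merge sort driven by an explicit stack of pending tasks. The task tags "split" and "merge", the function name merge_sort_with_stack, and the use of heapq.merge for the merge step are all assumptions made for this example.

```python
from heapq import merge


def merge_sort_with_stack(items):
    """Sort items with an explicit task stack instead of recursive calls."""
    work = [("split", list(items))]  # pending tasks, playing the role of call frames
    results = []                     # sorted sublists waiting to be merged
    while work:
        action, payload = work.pop()
        if action == "split":
            if len(payload) <= 1:
                results.append(payload)  # base case: already sorted
            else:
                mid = len(payload) // 2
                # Pushed in reverse order of execution: the left half is
                # handled first, then the right half, then their merge.
                work.append(("merge", None))
                work.append(("split", payload[mid:]))
                work.append(("split", payload[:mid]))
        else:
            right = results.pop()
            left = results.pop()
            results.append(list(merge(left, right)))
    return results[0]


print(merge_sort_with_stack([5, 2, 4, 8, 6, 3]))
```

The order in which tasks are popped mirrors the depth-first, left-first order of the recursive version, which is why the two stacks together can replace the call stack.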
Going with Dan's solution above and taking the advice on pop, I still tried to eliminate the while loop and other not-so-Pythonic constructs. Here is the solution I came up with:
PS: l = len
My doubt about Dan's solution is: what if L.pop() and L[x] refer to the same element and a conflict is created, as in the case of an odd length after iterating over half of the length of L?
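def merge_sort(m):
    L = [[x] for x in m] # split each element to its own list
    for x in range(1, l(L)): # range rather than Python 2's xrange; start at 1 so L[x-1] exists
        L[x] = list(merge(L[x-1], L[x])) # fold the sorted run so far into the next element
    return L[-1] if L else [] # an empty input has no last element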
This could go on as an academic discussion, but I got my answer: an alternative to the recursive method.
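On the odd-length doubt raised above: in the while-loop version, x always stays below half of the list's length at the start of the pass, while pop() removes the current last element, so the two indices never coincide. With an odd length, the middle list is simply left alone until the next pass. The sketch below checks this empirically for small inputs, assuming merge is heapq.merge and wrapping its result in list().

```python
from heapq import merge


def merge_sort(m):
    l = [[x] for x in m]              # one single-element list per item
    while len(l) > 1:                 # while there's merging to be done
        for x in range(len(l) >> 1):  # range is fixed when the pass starts
            # l.pop() takes the current last list, whose index is always > x
            l[x] = list(merge(l[x], l.pop()))
    return l[0] if len(l) else []


# Check every length from 0 to 7, covering both odd and even cases.
for n in range(8):
    data = list(range(n, 0, -1))      # n, n-1, ..., 1
    assert merge_sort(data) == sorted(data)
print("all lengths sorted correctly")
```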
