Python3 - mergeSort implementation

I'm trying to implement mergeSort on my own, but the order of the returned list is still not correct. I must be missing something (especially in the merge step). Could someone please help me find what it is?
Here is my code:
def merge(left_half, right_half):
    """
    Merge 2 sorted lists.
    :param left_half: sorted list C
    :param right_half: sorted list D
    :return: sorted list B
    """
    i = 0
    j = 0
    B = []
    for item in range(len(left_half) + len(right_half)):
        while i < len(left_half) and j < len(right_half):
            if left_half[i] <= right_half[j]:
                B.insert(item, left_half[i])
                i += 1
            else:
                B.insert(item, right_half[j])
                j += 1
    B += left_half[i:]
    B += right_half[j:]
    print("result: ", B)
    return B
def mergeSort(A):
    """
    Input: list A of n distinct integers.
    Output: list with the same integers, sorted from smallest to largest.
    :return: Output
    """
    # base case
    if len(A) < 2:
        return A
    # divide the list into two
    mid = len(A) // 2
    print(mid)
    left = A[:mid]  # first half of A
    right = A[mid:]  # second half of A
    x = mergeSort(left)  # recursively sort first half of A
    y = mergeSort(right)  # recursively sort second half of A
    return merge(x, y)
print(mergeSort([1, 3, 2, 4, 6, 5]))
Before the last merge I receive the two lists [1, 2, 3] and [4, 5, 6] correctly, but my final result is [3, 2, 1, 4, 5, 6].

In the first iteration of your for-loop, the inner while-loop already traverses one of the lists entirely, and because item is still 0, every element is inserted at index 0, which reverses their order.
You do not want to insert an element, you always want to append it. This then makes the for-loop unnecessary.
Here is a fixed version of your code:
def merge(left_half, right_half):
    """
    Merge 2 sorted arrays.
    :param left_half: sorted array C
    :param right_half: sorted array D
    :return: sorted array B
    """
    i = 0
    j = 0
    B = []
    while i < len(left_half) and j < len(right_half):
        if left_half[i] <= right_half[j]:
            B.append(left_half[i])
            i += 1
        else:
            B.append(right_half[j])
            j += 1
    B += left_half[i:]
    B += right_half[j:]
    print("result: ", B)
    return B
merge([1, 2, 3], [4, 5, 6])
# result: [1, 2, 3, 4, 5, 6]
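As a quick sanity check, the fixed merge can be dropped back into the recursive sort and compared against Python's built-in sorted. The sketch below restates both functions without the debug prints; the name merge_sort and the random test data are illustrative choices, not part of the original post.

```python
import random

def merge(left_half, right_half):
    """Merge two sorted lists into one sorted list."""
    i = 0
    j = 0
    merged = []
    while i < len(left_half) and j < len(right_half):
        if left_half[i] <= right_half[j]:
            merged.append(left_half[i])
            i += 1
        else:
            merged.append(right_half[j])
            j += 1
    # At most one of these two slices is non-empty.
    merged += left_half[i:]
    merged += right_half[j:]
    return merged

def merge_sort(values):
    """Sort a list by splitting it in half and merging the sorted halves."""
    if len(values) < 2:
        return values
    mid = len(values) // 2
    return merge(merge_sort(values[:mid]), merge_sort(values[mid:]))

print(merge_sort([1, 3, 2, 4, 6, 5]))  # [1, 2, 3, 4, 5, 6]

# Compare against the built-in sort on random data (illustrative only).
data = [random.randint(0, 50) for _ in range(200)]
assert merge_sort(data) == sorted(data)
```

Because merge only ever appends, the relative order of equal elements is preserved as well.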

Related

Cut a sequence of length N into subsequences such that the sum of each subarray is less than M and the cut minimizes the sum of max of each part

Given an integer sequence a_n of length N, cut the sequence into several parts, each of which is a consecutive subsequence of the original sequence.
Every part must satisfy the following:
The sum of each part is not greater than a given integer M.
Find a cut that minimizes the sum of the maximum integer of each part.
For example:
input : n = 8, m = 17 arr = [2, 2, 2, 8, 1, 8, 2, 1]
output = 12
explanation: subarrays = [2, 2, 2], [8, 1, 8], [2, 1]
sum = 2 + 8 + 2 = 12
0 <= N <= 100000
each integer is between 0 and 1000000
If no such cut exists, return -1
I believe this is a dynamic programming question, but I am not sure how to approach this.
I am relatively new to coding, and came across this question in an interview which I could not do. I would like to know how to solve it for future reference.
Here's what I tried:
n = 8
m = 17
arr = [2, 2, 2, 8, 1, 8, 2, 1]
biggest_sum, i = 0, 0
while (i < len(arr)):
    seq_sum = 0
    biggest_in_seq = -1
    while (seq_sum <= m and i < len(arr)):
        if (seq_sum + arr[i] <= m ):
            seq_sum += arr[i]
            if (arr[i] > biggest_in_seq):
                biggest_in_seq = arr[i]
            i += 1
        else:
            break
    biggest_sum += biggest_in_seq
if (biggest_sum == 0):
    print(-1)
else:
    print(biggest_sum)
This gives the result 16, and the subsequences are: [[2, 2, 2, 8, 1], [8, 2, 1]]

The problem is that you fill every sequence from left to right up to the maximum allowed sum m. You should evaluate different sequence lengths and minimize the result, which in the example means that the two 8 values must be in the same sequence.
A possible solution could be:
n = 8
m = 17
arr = [2, 2, 2, 8, 1, 8, 2, 1]
def find_solution(arr, m, n):
    if max(arr)>m:
        return -1
    optimal_seq_length = [0] * n
    optimal_max_sum = [0] * n
    for seq_start in reversed(range(n)):
        seq_len = 0
        seq_sum = 0
        seq_max = 0
        while True:
            seq_len += 1
            seq_end = seq_start + seq_len
            if seq_end > n:
                break
            last_value_in_seq = arr[seq_end - 1]
            seq_sum += last_value_in_seq
            if seq_sum > m:
                break
            seq_max = max(seq_max, last_value_in_seq)
            max_sum_from_next_seq_on = 0 if seq_end >= n else optimal_max_sum[seq_end]
            max_sum = max_sum_from_next_seq_on + seq_max
            if seq_len == 1 or max_sum < optimal_max_sum[seq_start]:
                optimal_max_sum[seq_start] = max_sum
                optimal_seq_length[seq_start] = seq_len
    # create solution list of lists
    solution = []
    seg_start = 0
    while seg_start < n:
        seg_length = optimal_seq_length[seg_start]
        solution.append(arr[seg_start:seg_start+seg_length])
        seg_start += seg_length
    return solution
print(find_solution(arr, m, n))
# [[2, 2, 2], [8, 1, 8], [2, 1]]
Key aspects of my proposal:
- Start from a small array (only the last element), and make the problem array grow to the front:
    [1]
    [2, 1]
    [8, 2, 1]
    etc.
- For each of the above problem arrays, store:
    - the optimal sum of the maximum of each sequence (optimal_max_sum), which is the value to be minimized
    - the length of the first sequence (optimal_seq_length) that achieves this optimal value
- Do this by trying each allowed sequence length starting at the beginning of the problem array:
    - calculate the new max_sum value by adding the maximum of this sequence to the previously calculated optimal_max_sum for the part after this sequence
    - keep the smallest max_sum, store it in optimal_max_sum, and store the associated seq_len in optimal_seq_length
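The question asks for the minimized value itself (12 in the example) rather than the list of parts. A minimal sketch of the same suffix-based idea that returns only that value is shown below; the function name min_sum_of_maxima and the best array are illustrative names, and the sketch assumes non-negative integers as stated in the question. It is quadratic in the worst case, so it may be too slow for the largest allowed N.

```python
def min_sum_of_maxima(arr, m):
    """Return the minimal sum of per-part maxima, or -1 if no valid cut exists."""
    n = len(arr)
    INF = float("inf")
    # best[i] = minimal sum of maxima for the suffix arr[i:];
    # best[n] = 0 stands for the empty suffix.
    best = [INF] * n + [0]
    for start in range(n - 1, -1, -1):
        part_sum = 0
        part_max = 0
        # Try every part arr[start:end + 1] whose sum stays within m.
        for end in range(start, n):
            part_sum += arr[end]
            if part_sum > m:
                break
            part_max = max(part_max, arr[end])
            best[start] = min(best[start], part_max + best[end + 1])
    return best[0] if best[0] != INF else -1

print(min_sum_of_maxima([2, 2, 2, 8, 1, 8, 2, 1], 17))  # 12
print(min_sum_of_maxima([5], 3))  # -1
```

An element larger than m leaves its entry at infinity, and that propagates to every longer suffix, which is how the -1 case falls out without a separate check.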

Comparing two lists and making new list

Let's say I have two lists, a=[1,2,3,4,5,6] and b=[2,34,5,67,5,6]. I want to create a third list that has 1 where the elements of a and b differ and 0 where they are the same, so for the lists above it would be c=[1,1,1,1,0,0].

You can zip the lists and compare them in a list comprehension. This takes advantage of the fact that booleans are equivalent to 1 and 0 in python:
a=[1,2,3,4,5,6]
b=[2,34,5,67,5,6]
[int(m!=n) for m, n, in zip(a, b)]
# [1, 1, 1, 1, 0, 0]

Try a list comprehension over elements of each pair of items in the list with zip:
[ 0 if i == j else 1 for i,j in zip(a,b) ]

Iterating with a for loop is an option, though list comprehension may be more efficient.
a=[1,2,3,4,5,6]
b=[2,34,5,67,5,6]
c=[]
for i in range(len(a)):
    if a[i] == b[i]:
        c.append(0)
    else:
        c.append(1)
print(c)
prints
[1, 1, 1, 1, 0, 0]

If you have many vector operations and they need to be fast, check out numpy.
import numpy as np
a=[1,2,3,4,5,6]
b=[2,34,5,67,5,6]
a = np.array(a)
b = np.array(b)
c = (a != b).astype(int)
# array([1, 1, 1, 1, 0, 0])

I don't know if this is exactly what you're looking for, but this should work:
Edit: I just found out that Joe Thor posted almost exactly the same thing a few minutes before me.
a = [1, 2, 3, 4, 5, 6]
b = [2, 34, 5, 67, 5, 6]
results = []
for f in range(0, len(a)):
    if a[f] == b[f]:
        results.append(0)
    else:
        results.append(1)
print(results)

This can be done fairly simply using a for loop. It does assume that both lists, a and b, are the same length. Example code would look something like this:
a = [1,2,3,4,5,6]
b = [2,34,5,67,5,6]
c = []
if len(a) == len(b):
    for i in range(0,len(a)):
        if(a[i] != b[i]):
            c.append(1)
        else:
            c.append(0)
This can also be done using a list comprehension:
a = [1,2,3,4,5,6]
b = [2,34,5,67,5,6]
c = []
if len(a) == len(b):
    c = [int(i != j) for i,j in zip(a,b)]
The list comprehension code is from this thread: Comparing values in two lists in Python

a = [1, 2, 3, 4, 5, 6]
b = [2, 34, 5, 67, 5,6]
c = []
index = 0
x = 1
y = 0
for i in range(len(a)):  # iterate from index 0 to the last index
    if a[index] != b[index]:  # compare the elements at this index
        c.append(x)  # if not equal, append 1 to c
        index += 1  # move to the next index in both lists
    else:
        c.append(y)  # if equal, append 0 to c
        index += 1
print(c)

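This should work for lists of any element type. Note that it marks each element of lstseasons by whether it appears anywhere in tstlist, rather than comparing the two lists position by position.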
tstlist = ["w","s","u"]
lstseasons = ["s","u","a","w"]
lstbool_Seasons = [1 if ele in tstlist else 0 for ele in lstseasons]
Output: lstbool_Seasons = [1,1,0,1]
This is the first time I have posted anything, still figuring out how things work here, so please forgive faux pas...

list index out of range in a merge and sort function

I tried writing a simple merge and sort function in Python and got stuck on the following error:
list index out of range.
I would appreciate it if you could help me fix it and figure out how to avoid it. I have added the code below:
def merge(lst1, lst2):
    # Gets two sorted lists and returns one merged and sorted list
    merge_sorted = []
    i = 0
    j = 0
    len1 = len(lst1) - 1
    len2 = len(lst2) - 1
    while i < len1 or j < len2:
        if lst1[i] < lst2[j]:
            merge_sorted.append(lst1[i])
            i += 1
        elif lst1[i] > lst2[j]:
            merge_sorted.append(lst2[j])
            j += 1
        else:
            merge_sorted.append(lst1[i])
            merge_sorted.append(lst2[j])
            i += 1
            j += 1
    return merge_sorted
lst1 = [2, 4, 5, 6, 8]
lst2 = [1, 3, 7, 9, 0]
merge(lst1, lst2)
What I got:
IndexError Traceback (most recent call last)
<ipython-input-13-572aad47097b> in <module>()
22 lst1 = [2, 4, 5, 6, 8]
23 lst2 = [1, 3, 7, 9, 0]
---> 24 merge(lst1, lst2)
<ipython-input-13-572aad47097b> in merge(lst1, lst2)
7 len2 = len(lst2) - 1
8 while i < len1 or j < len2:
----> 9 if lst1[i] < lst2[j]:
10 merge_sorted.append(lst1[i])
11 i += 1
IndexError: list index out of range

Your problem is the while condition:
    while i < len1 or j < len2:
It should be and. As soon as either condition is false, one list is exhausted, so you simply append the remainder of the other list to your result and you are done.
Your current code still enters the while-body and evaluates lst1[i] < lst2[j] after i or j has run past the end of its list, which raises the IndexError you see.
Also drop the - 1 from len1 and len2: with and, comparing against len(lst) - 1 would stop the loop one element early, and the leftover elements could then be appended out of order.
The full fixed code:
def merge(lst1, lst2):
    # Gets two sorted lists and returns one merged and sorted list
    merge_sorted = []
    i = 0
    j = 0
    len1 = len(lst1)  # full length, not len - 1
    len2 = len(lst2)
    while i < len1 and j < len2:  # use and
        if lst1[i] < lst2[j]:
            merge_sorted.append(lst1[i])
            i += 1
        elif lst1[i] > lst2[j]:
            merge_sorted.append(lst2[j])
            j += 1
        else:
            merge_sorted.append(lst1[i])
            merge_sorted.append(lst2[j])
            i += 1
            j += 1
    # add remainder lists - the slices evaluate to [] if past the end of the list
    merge_sorted.extend(lst1[i:])  # if i is already past the end this is []
    merge_sorted.extend(lst2[j:])  # if j is already past the end this is []
    return merge_sorted
lst1 = [2, 4, 5, 6, 8]
lst2 = [0, 1, 3, 7, 9]  # fixed input, needs to be sorted, yours was not
print(merge(lst1, lst2))
Output:
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

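As others have suggested, you can modify and run your program, but if you don't need to write the merge loop yourself you can get the same result in two lines.
Just extend list1 with the elements of list2:
list1.extend(list2)
and once the elements are in list1, sort it:
print(sorted(list1))
Note that this re-sorts the combined list instead of exploiting the fact that both inputs are already sorted, and that wrapping the result in set() would drop duplicates and lose the ordering.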

First of all, your logic is wrong. You pick the lower number each time and put it into the list, but what about the biggest number of all? You get stuck there, because the loop never picks the last one.
I changed the logic. Instead of counting up the indices, I remove the picked elements, and when one list becomes empty the rest of the other one is joined to the final list. The inputs are also sorted first, so unsorted lists work too.
Secondly, I renamed the function to merger. merge is not a built-in name in Python, but the standard library already has heapq.merge, so a distinct name avoids confusion.
def merger(l1, l2):
    merge_sorted = []
    t1, t2 = sorted(l1), sorted(l2)
    while len(t1) != 0 and len(t2) != 0:
        if t1[0] <= t2[0]:
            merge_sorted.append(t1[0])
            t1 = t1[1:]
        else:
            merge_sorted.append(t2[0])
            t2 = t2[1:]
    return merge_sorted + (t1 if len(t1) != 0 else t2)
lst2 = [2, 4, 5, 6, 8]
lst1 = [1, 3, 7, 9, 0, 10]
print(merger(lst1, lst2))

Here are the values for i, j just before that if condition-
0 0
0 1
1 1
1 2
2 2
3 2
4 2
4 3
5 3
Once either list has been traversed to the end, the next comparison indexes past it and raises the index out of range error.
Solution:
Use and instead of or in the loop condition, and append the remaining elements of the other list to the end of the sorted list after the loop.
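For checking a hand-written merge, the standard library's heapq.merge can serve as a reference, since it also expects already-sorted inputs and yields their elements in sorted order. The sketch below uses sorted variants of the lists from the question; it is meant as a cross-check, not as a replacement for the exercise.

```python
import heapq

lst1 = [2, 4, 5, 6, 8]
lst2 = [0, 1, 3, 7, 9]  # both inputs must already be sorted

# heapq.merge returns a lazy iterator, so wrap it in list() to see the result.
merged = list(heapq.merge(lst1, lst2))
print(merged)  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```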

Largest Subset whose sum is less than equal to a given sum

A list is defined as follows: [1, 2, 3]
and the sub-lists of this are:
[1], [2], [3],
[1,2]
[1,3]
[2,3]
[1,2,3]
Given K, for example 3, the task is to find the largest length of a sublist whose sum of elements is less than or equal to K.
I am aware of itertools in Python, but it results in a segmentation fault for larger lists. Is there a more efficient algorithm to achieve this? Any help would be appreciated.
My code is as follows:
from itertools import combinations
def maxLength(a, k):
    #print a,k
    l= []
    i = len(a)
    while(i>=0):
        lst= list(combinations(sorted(a),i))
        for j in lst:
            #print list(j)
            lst = list(j)
            #print sum(lst)
            sum1=0
            sum1 = sum(lst)
            if sum1<=k:
                return len(lst)
        i=i-1

You can use the dynamic programming solution that @Apy linked to. Here's a Python example:
def largest_subset(items, k):
    res = 0
    # We can form subset with value 0 from empty set,
    # items[0], items[0...1], items[0...2]
    arr = [[True] * (len(items) + 1)]
    for i in range(1, k + 1):
        # Subset with value i can't be formed from empty set
        cur = [False] * (len(items) + 1)
        for j, val in enumerate(items, 1):
            # cur[j] is True if we can form a set with value of i from
            # items[0...j-1]
            # There are two possibilities
            # - Set can be formed already without even considering item[j-1]
            # - There is a subset with value i - val formed from items[0...j-2]
            cur[j] = cur[j-1] or ((i >= val) and arr[i-val][j-1])
        if cur[-1]:
            # If subset with value of i can be formed store
            # it as current result
            res = i
        arr.append(cur)
    return res
ITEMS = [5, 4, 1]
for i in range(sum(ITEMS) + 1):
    print('{} -> {}'.format(i, largest_subset(ITEMS, i)))
Output:
0 -> 0
1 -> 1
2 -> 1
3 -> 1
4 -> 4
5 -> 5
6 -> 6
7 -> 6
8 -> 6
9 -> 9
10 -> 10
In the above, arr[i][j] is True if a subset with sum i can be chosen from items[0...j-1]. Naturally arr[0] contains only True values, since the empty set can always be chosen. Similarly, for all the successive rows the first cell is False, since the empty set cannot have a non-zero sum. Note that the function returns the largest achievable sum that does not exceed k, not the number of elements used.
For the rest of the cells there are two options:
- If there is already a subset with sum i even without considering items[j-1], the value is True.
- If there is a subset with sum i - items[j-1] among items[0...j-2], then we can add items[j-1] to it and have a subset with sum i.

As far as I can see (since you treat any selection of items from the initial array as a sub array), you can use a greedy algorithm with O(N*log(N)) complexity (you have to sort the array):
1. Assign the entire array to the sub array
2. If sum(sub array) <= k then stop and return the sub array
3. Remove the maximum item from the sub array
4. Go to 2
Example
[1, 2, 3, 5, 10, 25]
k = 12
Solution
sub array = [1, 2, 3, 5, 10, 25], sum = 46 > 12, remove 25
sub array = [1, 2, 3, 5, 10], sum = 21 > 12, remove 10
sub array = [1, 2, 3, 5], sum = 11 <= 12, stop and return
As an alternative, you can start with an empty sub array and add items from minimum to maximum while the sum is less than or equal to k:
sub array = [], sum = 0 <= 12, add 1
sub array = [1], sum = 1 <= 12, add 2
sub array = [1, 2], sum = 3 <= 12, add 3
sub array = [1, 2, 3], sum = 6 <= 12, add 5
sub array = [1, 2, 3, 5], sum = 11 <= 12, add 10
sub array = [1, 2, 3, 5, 10], sum = 21 > 12, stop,
return prior one: [1, 2, 3, 5]
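The second variant (start empty and add the smallest items first) could be sketched as follows. The function name largest_sublist is an illustrative choice, and the sketch assumes all items are non-negative, which the greedy argument relies on.

```python
def largest_sublist(items, k):
    """Greedy: take items from smallest to largest while the sum stays <= k."""
    chosen = []
    total = 0
    for value in sorted(items):
        if total + value > k:
            # Every remaining item is at least as large, so stop here.
            break
        chosen.append(value)
        total += value
    return chosen

print(largest_sublist([1, 2, 3, 5, 10, 25], 12))  # [1, 2, 3, 5]
print(len(largest_sublist([1, 2, 3], 3)))  # 2
```

The sort dominates the running time, which gives the O(N*log(N)) bound mentioned above.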

Generating the power set takes O(2^n) time, which is pretty bad. You can use the dynamic programming approach instead.
See here for the algorithm:
http://www.geeksforgeeks.org/dynamic-programming-subset-sum-problem/
And yes, https://www.youtube.com/watch?v=s6FhG--P7z0 (Tushar explains everything well) :D

Assume everything is positive. (Handling negatives is a simple extension of this and is left to the reader as an exercise). There exists an O(n) algorithm for the described problem. Using the O(n) median select, we partition the array based on the median. We find the sum of the left side. If that is greater than k, then we cannot take all elements, we must thus recur on the left half to try to take a smaller set. Otherwise, we subtract the sum of the left half from k, then we recur on the right half to see how many more elements we can take.
Partitioning the array based on median select and recurring on only one of the halves yields a runtime of n + n/2 + n/4 + n/8 + ..., a geometric series that sums to O(n).
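A rough sketch of this partition-and-recurse idea is given below. Note the swap: instead of the deterministic O(n) median select, it picks a random pivot, so the linear running time holds only in expectation. The name max_count is illustrative, and the sketch assumes positive integers.

```python
import random

def max_count(items, k):
    """Largest number of items whose sum is <= k (positive integers assumed)."""
    count = 0
    remaining = list(items)
    while remaining:
        # Random pivot stands in for the median; assumption, see lead-in.
        pivot = random.choice(remaining)
        smaller = [x for x in remaining if x < pivot]
        equal = [x for x in remaining if x == pivot]
        larger = [x for x in remaining if x > pivot]
        smaller_sum = sum(smaller)
        if smaller_sum > k:
            # Cannot afford everything below the pivot: search only there.
            remaining = smaller
            continue
        # Everything below the pivot fits: take it and reduce the budget.
        count += len(smaller)
        k -= smaller_sum
        affordable = min(len(equal), k // pivot)
        count += affordable
        if affordable < len(equal):
            # Budget ran out among the pivot-valued items.
            return count
        k -= pivot * len(equal)
        remaining = larger
    return count

print(max_count([1, 2, 3, 5, 10, 25], 12))  # 4
```

Each pass discards the pivot together with at least one side of the partition, so the loop always terminates.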

Sum of specific elements in a list, if they are consecutive (python)

What the question is asking for is, from a list of lists like the following, to return a tuple that contains, for each column, a tuple with the lengths of the runs of consecutive 2s in that column. If there are X consecutive 2s, they should appear as a single element X in the inner tuple, like this:
[[1, 2, 2, 1],
[2, 1, 1, 2],
[1, 1, 2, 2]]
Gives
((1,), (1,), (1, 1),(2,))
While
[[2, 2, 2, 2],
[2, 1, 2, 2],
[2, 2, 1, 2]]
Gives
((3,),(1, 1),(2,),(3,))
What about the same thing, but for the rows instead of the columns? Is there a "one-line" method to do it? I mean:
[[1, 2, 2, 1],
[2, 1, 1, 2],
[1, 1, 2, 2]]
Gives
((2,), (1, 1), (2,))
While
[[2, 2, 2, 2],
[2, 1, 2, 2],
[2, 2, 1, 2]]
Gives
((4,),(1, 2),(2, 1))
I have tried some things. This is one of them, but I can't finish it and don't know what to do next:
l = [[2,2,2],[2,2,2],[2,2,2]]
t = (((1,1),(2,),(2,)),((2,),(2,),(1,1)))
if [x.count(0) for x in l] == [0 for x in l]:
    espf = []*len(l)
    espf2 = []
    espf_atual = 0
    contador = 0
    for x in l:
        for c in x:
            celula = x[c]
            if celula == 2:
                espf_atual += 1
            else:
                if celula == 1:
                    espf[contador] = [espf_atual]
                    contador += 1
                    espf_atual = 0
        espf2 += [espf_atual]
        espf_atual = 0
    print(tuple(espf2))
Output:
(3, 3, 3)
This output is the correct one, but if I change the list (l) it doesn't work.

So, you have some mistakes in the code.
Indexing:
for c in x:
    celula = x[c]
It should be celula = c, as c already refers to each element of x.
Intermediate results:
For each row you store intermediate results as:
espf_atual = 0
...
espf_atual += 1
...
espf2 += [espf_atual]
but this only stores the last run of 2s for each row. That is, if a row is [2,1,2,2], then espf_atual = 2 at the end and only that last run is stored; the first run (before the 1) is overwritten.
To avoid this, you need to store intermediate results for each row. You got halfway there with espf = []*len(l), but you never used it properly later.
Find below a working example (not too different from your initial solution):
espf = []
for x in l:
    # Restart counters for every row
    espf_current = []  # Will store any sequences of 2
    contador = 0  # Will count consecutive 2's
    for c in x:
        celula = c
        if celula == 2:
            contador += 1  # Count number of 2
        elif celula == 1:
            if contador > 0:  # Store any 2 before 1
                espf_current += [contador]
            contador = 0
    if contador > 0:  # Check if the row ends in 2
        espf_current += [contador]
    # Store results of this row in the final results
    espf += [tuple(espf_current)]
print(tuple(espf))
The key to switching between rows and columns is to change the indexing method. Currently you iterate directly over the elements of each row, which doesn't let you switch between rows and columns.
Another way to do the iteration is over the indexes of the matrix (i, j for rows and columns) as follows:
numRows = len(l)
numCols = len(l[0])
for i in range(numRows):
    for j in range(numCols):
        celula = l[i][j]
The above is the index-based equivalent of the previous code. It assumes all the rows have the same length (which is true in your examples). Changing it from rows to columns is straightforward (tip: switch the loops); I leave it to you :P
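One possible way to get close to the "one-line" version asked about in the question is itertools.groupby, with zip(*grid) to transpose rows into columns. This is a sketch rather than the approach of the answer above; the helper names runs_of_twos, row_runs and column_runs are illustrative, and all rows are assumed to have the same length.

```python
from itertools import groupby

def runs_of_twos(line):
    """Lengths of the runs of consecutive 2s in one row or column."""
    return tuple(len(list(group)) for value, group in groupby(line) if value == 2)

def row_runs(grid):
    return tuple(runs_of_twos(row) for row in grid)

def column_runs(grid):
    # zip(*grid) yields the columns of the grid as tuples.
    return tuple(runs_of_twos(column) for column in zip(*grid))

grid = [[1, 2, 2, 1],
        [2, 1, 1, 2],
        [1, 1, 2, 2]]
print(row_runs(grid))     # ((2,), (1, 1), (2,))
print(column_runs(grid))  # ((1,), (1,), (1, 1), (2,))
```

Both directions share the same run-counting helper, so only the way the grid is sliced changes.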
