Merging two sorted arrays in python

I am trying to merge two sorted arrays recursively, and I can merge the first few numbers until one pointer exits the array. There seems to be some problem with the base case not getting executed. I have tried to print the new_arr with the pointers for each recursive call to debug but cannot seem to find a solution. Here is my code:
new_arr = []
i = 0
j = 0
def merge(arr1, arr2, i, j):
    #base case
    ##when arr1 pointer exits
    print(i, j, new_arr)
    if(i>len(arr1)-1):
        new_arr.append(arr2[j:])
        return new_arr
    ##when arr2 pointer exits
    if (j > len(arr2)-1):
        new_arr.append(arr1[i:])
        return new_arr
    if(arr1[i]<arr2[j]):
        new_arr.append(arr1[i])
        i+=1
        merge(arr1, arr2, i, j)
    elif(arr1[i]>=arr2[j]):
        new_arr.append(arr2[j])
        j+=1
        merge(arr1, arr2, i, j)
sortedarr = merge([1,9], [3,7,11,14,18,99], i, j)
print(sortedarr)
and here goes my output:
0 0 []
1 0 [1]
1 1 [1, 3]
1 2 [1, 3, 7]
2 2 [1, 3, 7, 9]
None

These are the issues:
new_arr.append(arr2[j:]) should be new_arr.extend(arr2[j:]). append is for appending one item to the list, while extend concatenates a second list to the first. The same change needs to happen in the second case.
As you count on getting the mutated list as a returned value, you should not discard the list that is returned by the recursive call. You should return it back to the caller, until the first caller gets it.
It is a bad idea to make new_arr a global variable. If the main program calls the function a second time for some other input, new_arr will still hold its previous values, polluting the result of the next call.
Although the first two fixes will make your function work (for a single test), the last issue is best fixed by using a different pattern:
Let the recursive call return the list that merges the values that still need to be analysed, i.e. from i and j onwards. The caller is then responsible for prepending its own value to that returned (partial) list. This way there is no more need for a global variable:
def merge(arr1, arr2, i, j):
    # one input is exhausted: the rest of the other one is already sorted
    if i >= len(arr1):
        return arr2[j:]
    if j >= len(arr2):
        return arr1[i:]
    # prepend the smaller head to the merge of what remains
    if arr1[i] < arr2[j]:
        return [arr1[i]] + merge(arr1, arr2, i + 1, j)
    else:
        return [arr2[j]] + merge(arr1, arr2, i, j + 1)
sortedarr = merge([1,9], [3,7,11,14,18,99], 0, 0)
print(sortedarr)
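To see the first issue in isolation, here is a small standalone sketch of the difference between append and extend (the list names nested and flat are only illustrative, they do not come from the question):

```python
nested = [1, 3, 7, 9]
nested.append([11, 14, 18, 99])  # adds ONE item, which is itself a list
print(nested)  # [1, 3, 7, 9, [11, 14, 18, 99]]

flat = [1, 3, 7, 9]
flat.extend([11, 14, 18, 99])    # adds every item of the second list
print(flat)    # [1, 3, 7, 9, 11, 14, 18, 99]
```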

Note that Python's standard library already has a function that knows how to merge sorted inputs, heapq.merge. It returns an iterator, so wrap it in list() to get a list:
import heapq
list(heapq.merge((1, 3, 5, 7), (2, 4, 6, 8)))
[1, 2, 3, 4, 5, 6, 7, 8]

Related

Cannot get the right output for swapping in list (python)

def largestPermutation(k, arr):
    for i in range(k):
        arr[i],arr[arr.index(max(arr[i:]))]=arr[arr.index(max(arr[i:]))],arr[i]
    return arr
a =[4,2,3,5,1]
print(largestPermutation(1,a))
This is my code to return the largest permutation of a list using k swaps. I worked through it with pen and paper and came up with the code above, but it returns the same array I passed in.
This is a HackerRank problem (Link : https://www.hackerrank.com/challenges/largest-permutation/problem)

The problem is that in the destination of the assignment, arr[arr.index(max(arr[i:]))] is calculated after arr[i] has been assigned. The maximum element has already been moved into the front position by then, so the index lookup always returns the position you are swapping into, and that position gets its original value back.
Assign the index of the max element to a variable before doing the swap.
def largestPermutation(k, arr):
    for i in range(k):
        max_index = arr.index(max(arr[i:]))
        arr[i],arr[max_index]=arr[max_index],arr[i]
    return arr
a =[4,2,3,5,1]
print(largestPermutation(1,a))
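The evaluation order can be traced step by step. The following sketch replays the original one-line swap on the sample list and then the corrected version; the variable name fixed is only illustrative:

```python
arr = [4, 2, 3, 5, 1]
i = 0

# The right-hand side is evaluated first: (arr[3], arr[0]) == (5, 4).
# The targets are then assigned left to right:
#   1. arr[0] = 5                      -> [5, 2, 3, 5, 1]
#   2. arr.index(max(arr[0:])) is evaluated only now and finds the 5
#      at index 0, so arr[0] = 4       -> [4, 2, 3, 5, 1]
arr[i], arr[arr.index(max(arr[i:]))] = arr[arr.index(max(arr[i:]))], arr[i]
print(arr)    # [4, 2, 3, 5, 1], unchanged

fixed = [4, 2, 3, 5, 1]
max_index = fixed.index(max(fixed[i:]))  # computed once, before any assignment
fixed[i], fixed[max_index] = fixed[max_index], fixed[i]
print(fixed)  # [5, 2, 3, 4, 1]
```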

You should assign it first as a variable:
def largestPermutation(k, arr):
    for i in range(k):
        x = arr.index(max(arr[i:]))
        arr[i], arr[x] = arr[x], arr[i]
    return arr
a =[4, 2, 3, 5, 1]
print(largestPermutation(1, a))
Output:
[5, 2, 3, 4, 1]

Permutations with repetition without two consecutive equal elements

I need a function that generates all the permutations with repetition of an iterable, with the constraint that two consecutive elements must be different; for example
sorted(f([0,1],3))==[(0,1,0),(1,0,1)]
#or
sorted(f([0,1],3))==[[0,1,0],[1,0,1]]
#I don't need the elements in the list to be sorted.
#the elements of the return can be tuples or lists, it doesn't change anything
Unfortunately, itertools.permutations doesn't work for what I need (each element of the iterable appears once or not at all in each result).
I've tried a couple of definitions. First, filtering the output of itertools.product(iterable,repeat=r), but that is too slow for what I need.
from itertools import product
def crp0(iterable,r):
    l=[]
    for f in product(iterable,repeat=r):
        #print(f)
        b=True
        last=None #supposing no element of the iterable is None, which is fine for me
        for element in f:
            if element==last:
                b=False
                break
            last=element
        if b: l.append(f)
    return l
Second, I tried to build r nested for loops, one inside the other (where r is the length of each permutation, usually written k in math).
def crp2(iterable,r):
    a=list(range(0,r))
    s="\n"
    tab="    " #4 spaces
    l=[]
    for i in a:
        s+=(2*i*tab+"for a["+str(i)+"] in iterable:\n"+
            (2*i+1)*tab+"if "+str(i)+"==0 or a["+str(i)+"]!=a["+str(i-1)+"]:\n")
    s+=(2*i+2)*tab+"l.append(a.copy())"
    exec(s)
    return l
I know, there's no need to remind me: exec is ugly, exec can be dangerous, exec isn't easy to read... I know.
To understand the function better, I suggest replacing exec(s) with print(s).
Here is an example of the string that is passed to exec for crp2([0,1],2):
for a[0] in iterable:
    if 0==0 or a[0]!=a[-1]:
        for a[1] in iterable:
            if 1==0 or a[1]!=a[0]:
                l.append(a.copy())
But, apart from using exec, I need a better function because crp2 is still too slow (even if faster than crp0). Is there any way to recreate the code with r nested for loops without using exec? Is there any other way to do what I need?

You could prepare the sequences in two halves, then preprocess the second halves to find the compatible choices.
def crp2(I,r):
    r0=r//2
    r1=r-r0
    A=crp0(I,r0) # Prepare first half sequences
    B=crp0(I,r1) # Prepare second half sequences
    D = {} # Dictionary showing compatible second half sequences for each token
    for i in I:
        D[i] = [b for b in B if b[0]!=i]
    return [a+b for a in A for b in D[a[-1]]]
In a test with iterable=[0,1,2] and r=15, I found this method to be over a hundred times faster than just using crp0.

You could try to return a generator instead of a list. With large values of r, your method will take a very long time to process product(iterable,repeat=r) and will return a huge list.
With this variant, you should get the first element very fast:
from itertools import product
def crp0(iterable, r):
    for f in product(iterable, repeat=r):
        last = f[0]
        b = True
        for element in f[1:]:
            if element == last:
                b = False
                break
            last = element
        if b:
            yield f
for no_repetition in crp0([0, 1], 12):
    print(no_repetition)
# (0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1)
# (1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0)

Instead of filtering the elements, you could generate a list directly with only the correct elements. This method uses recursion to create the cartesian product:
def product_no_repetition(iterable, r, last_element=None):
    if r == 0:
        return [[]]
    else:
        return [p + [x] for x in iterable
                for p in product_no_repetition(iterable, r - 1, x)
                if x != last_element]
for no_repetition in product_no_repetition([0, 1], 12):
    print(no_repetition)
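The comprehension above still makes the recursive call for every x and only discards the mismatches afterwards. As a hedged alternative that is not taken from the answer, a prefix-building generator can skip a repeated element before recursing. The name iter_no_repetition is hypothetical, and the sketch assumes the iterable can be traversed more than once (a list or tuple, for instance):

```python
def iter_no_repetition(iterable, r, prefix=()):
    """Yield tuples of length r in which no two neighbours are equal."""
    if len(prefix) == r:
        yield prefix
        return
    for x in iterable:
        # extend the prefix only if x differs from its last element
        if not prefix or x != prefix[-1]:
            yield from iter_no_repetition(iterable, r, prefix + (x,))

print(list(iter_no_repetition([0, 1], 3)))  # [(0, 1, 0), (1, 0, 1)]
```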

I agree with @EricDuminil's comment that you do not want "Permutations with repetition." You want a significant subset of the product of the iterable with itself multiple times. I don't know what name is best: I'll just call them products.
Here is an approach that builds each product line without building all the products then filtering out the ones you want. My approach is to work primarily with the indices of the iterable rather than the iterable itself--and not all the indices, but ignoring the last one. So instead of working directly with [2, 3, 5, 7] I work with [0, 1, 2]. Then I work with the products of those indices. I can transform a product such as [1, 2, 2] where r=3 by comparing each index with the previous one. If an index is greater than or equal to the previous one I increment the current index by one. This prevents two indices from being equal, and this also gets me back to using all the indices. So [1, 2, 2] is transformed to [1, 2, 3] where the final 2 was changed to a 3. I now use those indices to select the appropriate items from the iterable, so the iterable [2, 3, 5, 7] with r=3 gets the line [3, 5, 7]. The first index is treated differently, since it has no previous index. My code is:
from itertools import product
def crp3(iterable, r):
    L = []
    for k in range(len(iterable)):
        for f in product(range(len(iterable)-1), repeat=r-1):
            ndx = k
            a = [iterable[ndx]]
            for j in range(r-1):
                ndx = f[j] if f[j] < ndx else f[j] + 1
                a.append(iterable[ndx])
            L.append(a)
    return L
Using %timeit in my Spyder/IPython configuration on crp3([0,1], 3) shows 8.54 µs per loop while your crp2([0,1], 3) shows 133 µs per loop. That shows a sizeable speed improvement! My routine works best where iterable is short and r is large--your routine finds len ** r lines (where len is the length of the iterable) and filters them while mine finds len * (len-1) ** (r-1) lines without filtering.
By the way, your crp2() does do filtering, as shown by the if lines in your code that is execed. The sole if in my code does not filter a line, it modifies an item in the line. My code does return surprising results if the items in the iterable are not unique: if that is a problem, just change the iterable to a set to remove the duplicates. Note that I replaced your l name with L: I think l is too easy to confuse with 1 or I and should be avoided. My code could easily be changed to a generator: replace L.append(a) with yield a and remove the lines L = [] and return L.
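The generator variant described in the last sentence, written out as a sketch. The name crp3_iter is hypothetical, and the code assumes r >= 1 and unique items, as discussed above:

```python
from itertools import product

def crp3_iter(iterable, r):
    """Generator form of crp3: yields each line instead of collecting a list."""
    items = list(iterable)
    n = len(items)
    for k in range(n):
        for f in product(range(n - 1), repeat=r - 1):
            ndx = k
            a = [items[ndx]]
            for j in range(r - 1):
                # shift indices at or above the previous one so they never coincide
                ndx = f[j] if f[j] < ndx else f[j] + 1
                a.append(items[ndx])
            yield a

print(list(crp3_iter([0, 1], 3)))  # [[0, 1, 0], [1, 0, 1]]
```

The number of lines produced is len * (len-1) ** (r-1), for example 4 * 3 ** 2 = 36 for four items and r=3.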

How about:
from itertools import product
result = [ x for x in product(iterable,repeat=r) if all(x[i-1] != x[i] for i in range(1,len(x))) ]

Elaborating on @peter-de-rivaz's idea (divide and conquer): when you divide the sequence to create into two subsequences, those subsequences have the same length or very close to it. If r = 2*k is even, store the result of crp(k) in a list and merge it with itself. If r = 2*k+1, store the result of crp(k) in a list and merge it with itself and with L.
def large(L, r):
    if r <= 4: # do not divide all the way down: too slow
        return list(small(L, r)) # a list, because the result is iterated several times below
    n = r//2
    M = large(L, n)
    if r%2 == 0:
        return [x + y for x in M for y in M if x[-1] != y[0]]
    else:
        return [x + y + (e,) for x in M for y in M for e in L if x[-1] != y[0] and y[-1] != e]
small is an adaptation from @eric-duminil's answer using the famous for...else loop of Python:
from itertools import product
def small(iterable, r):
    for seq in product(iterable, repeat=r):
        prev, *tail = seq
        for e in tail:
            if e == prev:
                break
            prev = e
        else:
            yield seq
A small benchmark:
import timeit
print(timeit.timeit(lambda: crp2( [0, 1, 2], 10), number=1000))
#0.16290732200013736
print(timeit.timeit(lambda: crp2( [0, 1, 2, 3], 15), number=10))
#24.798989593000442
print(timeit.timeit(lambda: large( [0, 1, 2], 10), number=1000))
#0.0071403849997295765
print(timeit.timeit(lambda: large( [0, 1, 2, 3], 15), number=10))
#0.03471425700081454

Translating pseudocode into Python

This is the pseudocode I was given:
COMMENT: define a function sort1
INPUT: a list of numbers my_list
print the initial list
loop over all positions i in the list; starting with the second element (index 1)
    COMMENT: at this point the elements from 0 to i-1 in this list are sorted
    loop backward over those positions j in the list lying to the left of i; starting at position i-1 continue this loop as long as the value at j+1 is less than the value at j
        swap the values at positions j and j+1
    print the current list
And this is the python code I came up with:
#define a function sort1
my_list=range(1,40)
print
print my_list
num_comparisons=0
num_swaps=0
for pos in range (0,len(my_list)-1):
    for i in range(pos+1,len(my_list)): # starting at position i-1 continue this loop as long
                                        # as the value at j+1 is less than the value at j
        num_comparisons+=1
        if my_list[i]<my_list[pos]:
            num_swaps+=1
            [my_list[i],my_list[pos]]=[my_list[pos],my_list[i]]
    print my_list
print
print num_comparisons, num_swaps
I'm not sure I did it correctly though.

As I said in a comment, I think your code effectively does have the i and the j that the pseudocode COMMENT talks about. However, the pseudocode's i is your variable pos, which makes the pseudocode's j the variable that is named i in your code.
To see if your code works, you need to initially have an unsorted list—not the my_list=range(1,40) in your code (which is [1, 2, 3, ... 38, 39] and already in numerical order).
One thing you didn't do is define a sort1() function.
What is below is essentially your code, but I renamed the two variables to match the pseudocode COMMENT, and put (most of it) in a function definition where it's supposed to be.
Beyond that, I had to declare the variables num_comparisons and num_swaps (which aren't mentioned in the pseudocode) as global so they could be accessed outside of the function; otherwise they would have been local variables by default and only accessible within the function.
def sort1(items):
    """ Sort given list of items in-place. """
    # allow access to these variables outside of function
    global num_comparisons
    global num_swaps
    # initialize global variables
    num_comparisons = 0
    num_swaps = 0
    # starting at position i-1 continue this loop as long
    # as the value at j+1 is less than the value at j
    for i in range(0, len(items)-1):
        for j in range(i+1, len(items)):
            num_comparisons += 1
            if items[j] < items[i]:
                num_swaps += 1
                [items[j], items[i]] = [items[i], items[j]]
my_list = [6, 3, 7, 2, 9, 4, 5]
print 'my_list before sort:'
print my_list
sort1(my_list)
print 'my_list after sort:'
print my_list
print
print 'num_comparisons:', num_comparisons, ', num_swaps:', num_swaps
Output:
my_list before sort:
[6, 3, 7, 2, 9, 4, 5]
my_list after sort:
[2, 3, 4, 5, 6, 7, 9]
num_comparisons: 21 , num_swaps: 10
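For comparison, the pseudocode in the question reads like an insertion sort (a backward loop that swaps neighbouring values), whereas the code above compares each position with all later ones. A minimal sketch of a more literal translation follows; the name sort1_literal is hypothetical, and single-argument print(...) calls are used so the snippet runs on Python 2 and Python 3 alike:

```python
def sort1_literal(my_list):
    """Sort my_list in place, following the pseudocode step by step."""
    print(my_list)                      # print the initial list
    for i in range(1, len(my_list)):    # start with the second element
        # elements 0 .. i-1 are sorted at this point
        j = i - 1
        # walk backward while the right neighbour is smaller than the left one
        while j >= 0 and my_list[j + 1] < my_list[j]:
            my_list[j], my_list[j + 1] = my_list[j + 1], my_list[j]
            j -= 1
        print(my_list)                  # print the current list
    return my_list

sort1_literal([6, 3, 7, 2, 9, 4, 5])
```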

How to get the index of specific item in python matrix

I am a newbie to the Python programming language, and I am looking for a way to get the indexes (row and column) of a specific element in a matrix.
In other words, I want to do for a matrix what this code does for a list:
myList=[1,10,54,85]
myList.index(54)
Best Regards

Here's a simple function which returns the coordinates as a tuple (or None if no index is found). Note that this is for 2D matrices, and returns the first instance of the element in the matrix.
(Edit: see hiro protagonist's answer for an alternative Pythonic version)
def find(element, matrix):
    for i in range(len(matrix)):
        for j in range(len(matrix[i])):
            if matrix[i][j] == element:
                return (i, j)
Or, if you want to find all indexes rather than just the first:
def findall(element, matrix):
    result = []
    for i in range(len(matrix)):
        for j in range(len(matrix[i])):
            if matrix[i][j] == element:
                result.append((i, j))
    return result
You can use it like so:
A = [[5, 10],
     [15, 20],
     [25, 5]]
find(25, A) # Will return (2, 0)
find(50, A) # Will return None
findall(5, A) # Will return [(0, 0), (2, 1)]
findall(4, A) # Will return []

A (in my opinion) more Pythonic version of FlipTack's algorithm:
def find(element, matrix):
    for i, matrix_i in enumerate(matrix):
        for j, value in enumerate(matrix_i):
            if value == element:
                return (i, j)
In Python it is often more natural to iterate over the elements of lists instead of just the indices; if the indices are needed as well, enumerate helps. This is also more efficient.
Note: just like list.index (without a second argument), this will only find the first occurrence.
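The same enumerate idea extends to collecting every occurrence. A short sketch, where the name find_all is hypothetical and not taken from either answer:

```python
def find_all(element, matrix):
    """Return the (row, column) pair of every occurrence of element."""
    return [(i, j)
            for i, row in enumerate(matrix)
            for j, value in enumerate(row)
            if value == element]

A = [[5, 10],
     [15, 20],
     [25, 5]]
print(find_all(5, A))  # [(0, 0), (2, 1)]
print(find_all(4, A))  # []
```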

Since you say that you're a beginner, pardon me if you already know some of the below. Just in case I'll describe the basic logic you can use to write your own function or understand the other answers posted here better:
To access an element at a specific position in a list, for example to get the first element and save it in a variable:
myList=[1,10,54,85]
myvar = myList[0] # note that you access the first element with index 0
myvar now stores 1. Why index 0? Think of the index as an indicator of "how far away from the beginning of the list an element is." In other words, the first element is a distance of 0 from the start.
What if you have a multi-dimensional list like so?
multi = [[0, 1, 2],
         [3, 4, 5],
         [6, 7, 8]
        ]
Now you think in terms of row and column (and of course you could have n-dimensional lists and keep going).
How to retrieve the 5? That is a distance of 1 row from the start of the list of rows and 2 columns away from the start of the sub-list.
Then:
myvar = multi[1][2]
retrieves the 5.
FlipTack's and hiro protagonist's functions wrap this logic in nice compact procedures, which search the entire 2-dimensional list, comparing elements until the desired one is found, then returning a tuple of the indices or continuing to search for duplicate elements. Note that if your lists are guaranteed to be sorted you can then use a binary search algorithm across rows and columns and get the answer faster, but no need to worry about that for now.
Hopefully this helps.
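For the curious, here is a rough sketch of the binary search idea using the standard bisect module. It assumes only that each row is sorted in ascending order; the name find_sorted is hypothetical. Every row is still visited, but the search inside a row takes logarithmic time:

```python
from bisect import bisect_left

def find_sorted(element, matrix):
    """Return (row, column) of element, or None; each row must be sorted."""
    for i, row in enumerate(matrix):
        j = bisect_left(row, element)   # leftmost position where element could sit
        if j < len(row) and row[j] == element:
            return (i, j)
    return None

multi = [[0, 1, 2],
         [3, 4, 5],
         [6, 7, 8]]
print(find_sorted(5, multi))   # (1, 2)
print(find_sorted(42, multi))  # None
```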

You can also add a flag to your function to choose between the first occurrence and all occurrences of a value in your input matrix/list.
For example:
If your input is a 1D vector:
def get_index_1d(a = [], val = 0, occurrence_pos = False):
    if not occurrence_pos:
        for k in range(len(a)):
            if a[k] == val:
                return k
    else:
        return [k for k in range(len(a)) if a[k] == val]
Output:
a = [1,10,54,85, 10]
index = get_index_1d(a, 10, False)
print("Without occurrence: ", index)
index = get_index_1d(a, 10, True)
print("With occurrence: ", index)
>>> Without occurrence: 1
>>> With occurrence: [1, 4]
For 2D vector:
def get_index_2d(a = [], val = 0, occurrence_pos = False):
    if not occurrence_pos:
        for k in range(len(a)):
            for j in range(len(a[k])):
                if a[k][j] == val:
                    return (k, j)
    else:
        return [(k, j) for k in range(len(a)) for j in range(len(a[k])) if a[k][j] == val]
Output:
b = [[1,2],[3,4],[5,6], [3,7]]
index = get_index_2d(b, 3, False)
print("Without occurrence: ", index)
index = get_index_2d(b, 3, True)
print("With occurrence: ", index)
>>> Without occurrence: (1, 0)
>>> With occurrence: [(1, 0), (3, 0)]

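Just wanted to throw in another solution since I didn't see it above:
def find(matrix, value):
    value_indexs = [ ( matrix.index(row), row.index(value) ) for row in matrix if value in row]
    return value_indexs
Example:
matrix = [
    [0, 1, 2],
    [3, 4, 5, 6],
    [7, 8, 9, 0]
]
find(matrix, 0)
Returns: [(0,0), (2,3)]
Note that this reports only the first match in each row, and that matrix.index(row) returns the position of the first equal row, so identical rows all map to the same row index.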

Getting the indices of the X largest numbers in a list

Please no built-ins besides len() or range(). I'm studying for a final exam.
Here's an example of what I mean.
def find_numbers(x, lst):
    ...  # body still to be written
lst = [3, 8, 1, 2, 0, 4, 8, 5]
find_numbers(3, lst) # this should return -> (1, 6, 7)
I tried this, but did not get far; I couldn't figure out the best way of going about it:
def find_K_highest(lst, k):
    newlst = [0] * k
    maxvalue = lst[0]
    for i in range(len(lst)):
        if lst[i] > maxvalue:
            maxvalue = lst[i]
            newlst[0] = i

Take the first 3 (x) numbers from the list. These are the initial candidates for the largest values. In your case: 3, 8, 1. Their indices are (0, 1, 2). Build (value, index) pairs from them: ((3,0), (8,1), (1,2)).
Now sort them by value, largest first: ((8,1), (3,0), (1,2)).
With this initial list, you can traverse the rest of the list recursively. Compare the smallest value (1, _) with the next element in the list (2, 3). If that is larger (it is), insert it into the sorted list, giving ((8,1), (3,0), (2,3)), and throw away the smallest.
In the beginning you have many changes in the top 3, but later on they become rare. Of course you also have to keep track of the current position (3, 4, 5, ...) while traversing.
An insertion sort for the top N elements should perform pretty well.
Here is a similar problem in Scala but without the need to report the indexes.
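The steps above can be sketched in Python as follows. This is only one possible reading of the description: it uses a loop instead of recursion, the name top_indices is hypothetical, and it returns a list rather than a tuple. The only built-ins called are len() and range():

```python
def top_indices(x, lst):
    """Indices of the x largest values of lst, in ascending index order."""
    top = []  # [value, index] pairs, largest value first, at most x entries
    for i in range(len(lst)):
        value = lst[i]
        # find the insertion position; on ties the earlier index stays in front
        pos = len(top)
        while pos > 0 and top[pos - 1][0] < value:
            pos -= 1
        if pos < x:
            # insert the new pair and drop the smallest if there are too many
            top = (top[:pos] + [[value, i]] + top[pos:])[:x]
    # insertion sort of the surviving indices
    indices = []
    for pair in top:
        idx = pair[1]
        pos = len(indices)
        while pos > 0 and indices[pos - 1] > idx:
            pos -= 1
        indices = indices[:pos] + [idx] + indices[pos:]
    return indices

print(top_indices(3, [3, 8, 1, 2, 0, 4, 8, 5]))  # [1, 6, 7]
```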

I don't know whether it is good to post a full solution, but this seems to work:
def find_K_highest(lst, k):
    # escape index error
    if k>len(lst):
        k=len(lst)
    # the output array
    idxs = [None]*k
    # a real list (not a range object), so that entries can be deleted below
    to_watch = [n for n in range(len(lst))]
    # do it k times
    for i in range(k):
        # guess that max value is at least at idx '0' of to_watch
        to_del=0
        idx = to_watch[to_del]
        max_val = lst[idx]
        # search through the list for bigger value and its index
        for jj in range(len(to_watch)):
            j=to_watch[jj]
            val = lst[j]
            # check that it is bigger than the previously found max
            if val > max_val:
                idx = j
                max_val = val
                to_del=jj
        # append it
        idxs[i] = idx
        del to_watch[to_del]
    # return answer
    return idxs
PS I tried to explain every line of code.

Can you use list methods (e.g. append, sort, index)? If so, this should work (I think...)
def find_numbers(n,lst):
    ll=lst[:]
    ll.sort()
    biggest=ll[-n:]
    idx=[lst.index(i) for i in biggest] #This has the indices already, but we could have trouble if one of the numbers appeared twice
    idx.sort()
    #check for duplicates. Duplicates will always be next to each other since we sorted.
    for i in range(1,len(idx)):
        if(idx[i-1]==idx[i]):
            #found a duplicate: search the rest of the input list for the next occurrence
            #(+1 because the slice starts one position after idx[i])
            idx[i]=idx[i]+1+lst[idx[i]+1:].index(lst[idx[i]])
    idx.sort()
    return idx
lst = [3, 8, 1, 2, 0, 4, 8, 5]
print(find_numbers(3, lst))

Dude. You have two ways you can go with this.
First way is to be clever. Psych your teacher out. What she is looking for is recursion. You can write this with NO recursion and NO built-in functions or methods (apart from the allowed len() and range()):
#!/usr/bin/python
lst = [3, 8, 1, 2, 0, 4, 8, 5]
minval=-2**64
largest=[]
def enum(lst):
    for i in range(len(lst)):
        yield i,lst[i]
for x in range(3):
    m=minval
    m_index=None
    for i,j in enum(lst):
        if j>m:
            m=j
            m_index=i
    if m_index is not None: # a plain "if m_index:" would wrongly skip index 0
        largest=largest+[m_index]
        lst[m_index]=minval
print(largest)
This works. It is clever. Take that teacher!!! BUT, you will get a C or lower...
OR -- you can be the teacher's pet. Write it the way she wants. You will need a recursive max of a list. The rest is easy!
def max_of_l(l):
    if len(l) <= 1:
        if not l:
            raise ValueError("Max() arg is an empty sequence")
        else:
            return l[0]
    else:
        m = max_of_l(l[1:])
        return m if m > l[0] else l[0]
print(max_of_l([3, 8, 1, 2, 0, 4, 8, 5]))
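One possible way to finish the recursive route is sketched below. It is an illustration rather than the answer's own code: the helper names argmax_rec and k_largest_indices are hypothetical, only len() and range() are called, and the result is ordered by value (largest first) rather than by index:

```python
def argmax_rec(lst, skip, start=0):
    """Index of the first largest element of lst[start:], ignoring indices
    listed in skip; None if no candidate is left."""
    if start >= len(lst):
        return None
    best = argmax_rec(lst, skip, start + 1)
    if start in skip:
        return best
    if best is None or lst[start] >= lst[best]:
        return start
    return best

def k_largest_indices(k, lst):
    chosen = []
    for _ in range(k):
        idx = argmax_rec(lst, chosen)
        if idx is None:      # fewer than k elements available
            break
        chosen = chosen + [idx]
    return chosen

print(k_largest_indices(3, [3, 8, 1, 2, 0, 4, 8, 5]))  # [1, 6, 7]
```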
