How to find lowest 3 values in a matrix - python

So I have this function to find the lowest value in a matrix and return its position in the matrix, i.e. its indices:
import numpy as np

final_matrix = np.array([[3.57, 2.71, 9.2, 5.63],
                         [4.42, 1.4, 3.53, 8.97],
                         [1.2, 0.33, 6.26, 7.77],
                         [6.36, 3.6, 8.91, 7.42],
                         [1.59, 0.9, 2.4, 4.24]])  # this changes in my code; I'm just giving a very simple version of it here
def lowest_values(final_matrix):
    best_value = 10000  # or any arbitrarily high number
    for i in range(0, len(final_matrix[:, 0])):
        for j in range(0, len(final_matrix[0, :])):
            if final_matrix[i, j] < best_value:
                best_value = final_matrix[i, j]
                lowest_val_i = i
                lowest_val_j = j
    return (lowest_val_i, lowest_val_j)
This returns me (1,2), which just by visual analysis is correct. I now want to find the lowest 3 values, hopefully by building on this loop. But I really cannot think how! Or at least I don't know how to implement it. I was thinking of some if-else logic: if the lowest value is already found, then 'void' that one and find the 2nd lowest, and then do the same thing to find the third. But I'm not sure.
Please don't be too quick to shut this question down; I'm very new to programming, and very stuck!
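(A minimal sketch of that 'void the found minimum and repeat' idea, assuming final_matrix is a NumPy array and using np.inf as the 'void' marker; the function name is just for illustration:)
import numpy as np

def lowest_three(final_matrix):
    m = np.array(final_matrix, dtype=float)  # work on a copy so the original stays intact
    found = []
    for _ in range(3):
        i, j = np.unravel_index(np.argmin(m), m.shape)  # position of the current minimum
        found.append((m[i, j], i, j))
        m[i, j] = np.inf  # 'void' this entry so the next pass finds the next-lowest value
    return found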

The human approach
I think my approach to this is different enough from the other answers to share.
I am only doing at most 3 comparisons for every element, so it should be O(n). Also, I'm not creating an entirely new list of (value, indices) tuples for all the elements.
matrix = [[3.57, 2.71, 9.2, 5.63],
          [4.42, 1.4, 3.53, 8.97],
          [1.2, 0.33, 6.26, 7.77],
          [6.36, 3.6, 8.91, 7.42],
          [1.59, 0.9, 2.4, 4.24]]

def compare_least_values(value, i, j):
    global least
    if value < least[2][0]:
        if value < least[1][0]:
            if value < least[0][0]:
                least.insert(0, (value, (i, j)))
            else:
                least.insert(1, (value, (i, j)))
        else:
            least.insert(2, (value, (i, j)))

def lowest_three_values(matrix):
    global least
    least = [(10000, (None, None)), (10000, (None, None)), (10000, (None, None))]
    for i, row in enumerate(matrix):
        for j, value in enumerate(row):
            compare_least_values(value, i, j)
    return least[:3]

print(lowest_three_values(matrix))
Output:
[(0.33, (2, 1)), (0.9, (4, 1)), (1.2, (2, 0))]
The practical approach (NumPy)
If you're familiar with NumPy, then this is the way to go. Even if you're not, it can be used as a copy-paste snippet.
import numpy as np
matrix=[[3.57, 2.71, 9.2, 5.63],
[4.42, 1.4, 3.53, 8.97],
[1.2, 0.33, 6.26, 7.77],
[6.36, 3.6, 8.91, 7.42],
[1.59, 0.9, 2.4, 4.24]]
matrix = np.array(matrix)
indices_1d = np.argpartition(matrix, 3, axis=None)[:3]
indices_2d = np.unravel_index(indices_1d, matrix.shape)
least_three = matrix[indices_2d]
print('least three values : ', least_three)
print('indices : ', *zip(*indices_2d) )
Output:
least three values : [0.33 0.9 1.2 ]
indices : (2, 1) (4, 1) (2, 0)
See this Stack Overflow question for a detailed answer on this.

I didn't understand that (1,2) return value. The lowest value of the matrix is 0.33, and its position is (2,1).
So here is my solution for your code:
all_items = []
# I appended all of the matrix's items to one list
for row in final_matrix:
    for i in row:
        all_items.append(i)
# and I sorted that list from min to max
all_items.sort()
# then I took the first 3 values
lowest_3 = all_items[0:3]
positions = []
# and I appended their positions in the matrix to positions
for i in lowest_3:
    for row in range(len(final_matrix)):
        if i in final_matrix[row]:
            positions.append([row, final_matrix[row].index(i)])
            break
# lowest_3 = [0.33, 0.9, 1.2]
# positions = [[2, 1], [4, 1], [2, 0]]

This is my solution:
final_matrix = [[3.57, 2.71, 9.2, 5.63],
                [4.42, 1.4, 3.53, 8.97],
                [1.2, 0.33, 6.26, 7.77],
                [6.36, 3.6, 8.91, 7.42],
                [1.59, 0.9, 2.4, 4.24]]
min_values = []
for i in range(3):
    mini = final_matrix[0][0]
    for row in final_matrix:
        for n in row:
            if n < mini:
                mini = n
                n_index = row.index(n)
                row_index = final_matrix.index(row)
    min_values.append(mini)
    del final_matrix[row_index][n_index]
print("Finals {}".format(min_values))
Let me explain:
The first loop runs over how many min values you want (change it and you will see what I mean).
The second and third loops go through the matrix to take the minimum value.
The line del final_matrix[row_index][n_index] destroys the minimal number IN the original matrix.
So if you want to keep the original matrix, you have to create a new one and copy the original into it => use deepcopy() from the copy module, as in the sketch below.
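For example, a minimal sketch of copy.deepcopy with a nested list (the variable names are just for illustration):
import copy

original = [[3.57, 2.71], [1.2, 0.33]]
working_copy = copy.deepcopy(original)   # the inner lists are copied too

del working_copy[1][1]                   # destroys 0.33 only in the copy
print(original)                          # [[3.57, 2.71], [1.2, 0.33]]  (unchanged)
print(working_copy)                      # [[3.57, 2.71], [1.2]]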

You are looking for the N smallest items from a list of items (or a "matrix" in this case). When N is small you can do better than sorting the whole list: create a heap queue, which takes linear time, and then pop the N smallest elements, each pop being an O(log n) operation on a heap of n elements. The heap queue is an important data structure, which you should study.
import heapq

final_matrix = [[3.57, 2.71, 9.2, 5.63],
                [4.42, 1.4, 3.53, 8.97],
                [1.2, 0.33, 6.26, 7.77],
                [6.36, 3.6, 8.91, 7.42],
                [1.59, 0.9, 2.4, 4.24]]

def lowest_values(final_matrix):
    # create a flat list, l, from the matrix
    # each element is a tuple: (value, x-coordinate, y-coordinate)
    l = [(final_matrix[x][y], x, y)
         for x in range(len(final_matrix))
         for y in range(len(final_matrix[0]))
         ]
    heapq.heapify(l)  # O(N) time
    for _ in range(3):
        # pop next smallest tuple:
        value, x, y = heapq.heappop(l)  # O(log N) time
        print(f'value={value}, x={x}, y={y}')

lowest_values(final_matrix)
Prints:
value=0.33, x=2, y=1
value=0.9, x=4, y=1
value=1.2, x=2, y=0
Note
The above code could have been simplified to the following, which is probably even slightly more efficient if all you want are the 3 smallest items and then you have no further need for the heap queue structure. But I wanted to show the two basic operations of creating a heap queue from a list and then successively producing the smallest items from that heap queue:
import heapq

final_matrix = [[3.57, 2.71, 9.2, 5.63],
                [4.42, 1.4, 3.53, 8.97],
                [1.2, 0.33, 6.26, 7.77],
                [6.36, 3.6, 8.91, 7.42],
                [1.59, 0.9, 2.4, 4.24]]

def lowest_values(final_matrix):
    # create a flat list, l, from the matrix
    # each element is a tuple: (value, x-coordinate, y-coordinate)
    l = [(final_matrix[x][y], x, y)
         for x in range(len(final_matrix))
         for y in range(len(final_matrix[0]))
         ]
    for value, x, y in heapq.nsmallest(3, l):
        print(f'value={value}, x={x}, y={y}')

lowest_values(final_matrix)

Related

Index values from 2D to 1D array

I have defined a function which searches for the row and column index of the minimum value of a given 2D array (main_array). In this case, the minimum value for main_array is 1.1, so the index should be [0,2]. I then must use the row index value 0 to index into another given 1D array A_array, and similarly the column index value 2 into another given 1D array B_array, which is the part I am struggling with.
The following is my code so far:
import numpy as np

main_array = np.array([[3.1, 2.1, 1.1],
                       [4.1, 1.6, 2.4],
                       [2.2, 3.2, 3.6],
                       [1.5, 2.5, 3.5]])
A_array = np.array([3.7, 4.7, 5.7, 6.7])
B_array = np.array([1.5, 1.8, 2.1])

def min_picks(main_array, A_array, B_array):
    min_index = np.argwhere(main_array == np.min(main_array))  # this gives [[0 2]]
    A_pick = A_array[min_index[0]]
    B_pick = B_array[min_index[-1]]
    return A_pick, B_pick
The function should return an expected answer of A_array[0] which is assigned to A_pick, and B_array[2] assigned to B_pick.
You can use reduce to flatten the min_index and simply access what you need from that flattened list.
from functools import reduce

def min_picks(main_array, A_array, B_array):
    min_index = reduce(lambda z, y: z + y, np.argwhere(main_array == np.min(main_array)))
    A_pick = A_array[min_index[0]]
    B_pick = B_array[min_index[1]]
    return A_pick, B_pick

print(min_picks(main_array, A_array, B_array))
This will give you:
(3.7, 2.1)
Your array index is not correct. Try the following instead:
main_array = np.array([[3.1, 2.1, 1.1],
                       [4.1, 1.6, 2.4],
                       [2.2, 3.2, 3.6],
                       [1.5, 2.5, 3.5]])
A_array = np.array([3.7, 4.7, 5.7, 6.7])
B_array = np.array([1.5, 1.8, 2.1])

def min_picks(main_array, A_array, B_array):
    min_index = np.argwhere(main_array == np.min(main_array))  # this gives [[0 2]]
    A_pick = A_array[min_index[:, 0]][0]
    B_pick = B_array[min_index[:, 1]][0]
    return A_pick, B_pick
>>> min_picks(main_array,A_array,B_array)
#(3.7, 2.1)

Pythonic way to remove elements from Numpy array closer than threshold

What is the best way to remove the minimal number of elements from a sorted Numpy array so that the minimal distance among the remaining is always bigger than a certain threshold?
For example, if the threshold is 1, the following sequence [0.1, 0.5, 1.1, 2.5, 3.] will become [0.1, 1.1, 2.5]. The 0.5 is removed because it is too close to 0.1 but then 1.1 is preserved because it is far enough from 0.1.
My current code:
import numpy as np
MIN_DISTANCE = 1
a = np.array([0.1, 0.5, 1.1, 2.5, 3.])
for i in range(len(a) - 1):
    if a[i + 1] - a[i] < MIN_DISTANCE:
        a[i + 1] = a[i]
a = np.unique(a)
a
array([0.1, 1.1, 2.5])
Is there a more efficient way to do so?
Note that my question is similar to Remove values from numpy array closer to each other but not exactly the same.
You could use numpy.ufunc.accumulate to iterate through adjacent pairs of the array instead of the for loop.
The numpy.add.accumulate example or itertools.accumulate probably shows best what it's doing.
Along with numpy.frompyfunc, your condition can be applied as a ufunc (universal function).
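For instance, a quick illustration of what accumulate does (each output element is the running result of combining everything so far), here with plain addition:
import numpy as np
from itertools import accumulate

print(np.add.accumulate([1, 2, 3, 4]))   # [ 1  3  6 10]
print(list(accumulate([1, 2, 3, 4])))    # [1, 3, 6, 10]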
Code: (with an extended array to cross check some additional cases, but works with your array as well)
import numpy as np

MIN_DISTANCE = 1
a = np.array([0.1, 0.5, 0.6, 0.7, 1.1, 2.5, 3., 4., 6., 6.1])
print("original: \n" + str(a))

def my_py_function(arr1, arr2):
    if arr2 - arr1 < MIN_DISTANCE:
        arr2 = arr1
    return arr2

my_np_function = np.frompyfunc(my_py_function, 2, 1)

my_np_function.accumulate(a, dtype=object, out=a).astype(float)  # plain object instead of the removed np.object alias
print("complete: \n" + str(a))
a = np.unique(a)
print("unique: \n" + str(a))
Result:
original:
[0.1 0.5 0.6 0.7 1.1 2.5 3. 4. 6. 6.1]
complete:
[0.1 0.1 0.1 0.1 1.1 2.5 2.5 4. 6. 6. ]
unique:
[0.1 1.1 2.5 4. 6. ]
Concerning execution time, timeit shows a turnaround at an array length of about 20: your code is much faster (relatively) for your array length of 5, whereas for array lengths well above 20 the accumulate option speeds up considerably (~35% less time for an array length of 300). A rough benchmark along those lines is sketched below.
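A minimal benchmark sketch, assuming numpy and a setup comparable to the code above (the helper names and test sizes are mine, and absolute timings will vary by machine):
import timeit
import numpy as np

MIN_DISTANCE = 1

def loop_version(arr):
    a = arr.copy()
    for i in range(len(a) - 1):
        if a[i + 1] - a[i] < MIN_DISTANCE:
            a[i + 1] = a[i]
    return np.unique(a)

def keep_or_carry(prev, cur):
    # carry the previously kept value forward when cur is too close to it
    return prev if cur - prev < MIN_DISTANCE else cur

carry_ufunc = np.frompyfunc(keep_or_carry, 2, 1)

def accumulate_version(arr):
    filled = carry_ufunc.accumulate(arr, dtype=object)
    return np.unique(filled.astype(float))

for n in (5, 20, 300):
    data = np.sort(np.random.rand(n) * n)  # sorted test data
    t_loop = timeit.timeit(lambda: loop_version(data), number=200)
    t_acc = timeit.timeit(lambda: accumulate_version(data), number=200)
    print(f"n={n:4d}  loop: {t_loop:.4f}s  accumulate: {t_acc:.4f}s")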

Recursively dividing up a list, based on the endpoints

Here is what I am trying to do.
Take the list:
list1 = [0,2]
This list has start point 0 and end point 2.
Now, if we were to take the midpoint of this list, the list would become:
list1 = [0,1,2]
Now, if we were to recursively split up the list again (take the midpoints of the midpoints), the list would become:
list1 = [0,.5,1,1.5,2]
I need a function that will generate lists like this, preferably by keeping track of a variable. So, for instance, let's say there is a variable, n, that keeps track of something. When n = 1, the list might be [0,1,2] and when n = 2, the list might be [0,.5,1,1.5,2], and I am going to increment the value of n to keep track of how many times I have divided up the list.
I know you need to use recursion for this, but I'm not sure how to implement it.
Should be something like this:
def recursive(list1, a, b, n):
    """list1 is a list of values, a and b are the start
    and end points of the list, and n is an int representing
    how many times the list needs to be divided"""
    mid = len(list1) // 2
    # stuff
Could someone help me write this function? Not for homework; it's part of a project I'm working on that involves using mesh analysis to divide up a rectangle into parts.
This is what I have so far:
def recursive(a, b, list1, n):
    w = b - a
    mid = a + w / 2
    left = list1[0:mid]
    right = list1[mid:len(list1) - 1]
    return recursive(a, mid, list1, n) + mid + recursive(mid, b, list1, n)
but I'm not sure how to incorporate n into here.
NOTE: The list1 would initially be [a,b] - I would just manually enter that but I'm sure there's a better way to do it.
You've generated some interesting answers. Here are two more.
My first uses an iterator to avoid
slicing the list and is recursive because that seems like the most natural formulation.
def list_split(orig, n):
    if not n:
        return orig
    else:
        li = iter(orig)
        this = next(li)
        result = [this]
        for nxt in li:
            result.extend([(this + nxt) / 2, nxt])
            this = nxt
        return list_split(result, n - 1)

for i in range(6):
    print(i, list_split([0, 2], i))
This prints
0 [0, 2]
1 [0, 1.0, 2]
2 [0, 0.5, 1.0, 1.5, 2]
3 [0, 0.25, 0.5, 0.75, 1.0, 1.25, 1.5, 1.75, 2]
4 [0, 0.125, 0.25, 0.375, 0.5, 0.625, 0.75, 0.875, 1.0, 1.125, 1.25, 1.375, 1.5, 1.625, 1.75, 1.875, 2]
5 [0, 0.0625, 0.125, 0.1875, 0.25, 0.3125, 0.375, 0.4375, 0.5, 0.5625, 0.625, 0.6875, 0.75, 0.8125, 0.875, 0.9375, 1.0, 1.0625, 1.125, 1.1875, 1.25, 1.3125, 1.375, 1.4375, 1.5, 1.5625, 1.625, 1.6875, 1.75, 1.8125, 1.875, 1.9375, 2]
My second is based on the observation that recursion isn't necessary if you always start from two elements. Suppose those elements are mn and mx. After N applications of the split operation you will have 2^N+1 elements in it, so the numerical distance between the elements will be (mx-mn)/(2**N).
Given this information it should therefore be possible to deterministically compute the elements of the array, or even easier to use numpy.linspace like this:
import numpy

def grid(emin, emax, N):
    return numpy.linspace(emin, emax, 2**N + 1)
This appears to give the same answers, and will probably serve you best in the long run.
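In use, assuming numpy has been imported and grid defined as above, this reproduces the n = 2 case from the question:
>>> grid(0, 2, 2)
array([0. , 0.5, 1. , 1.5, 2. ])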
You can use some arithmetic and slicing to figure out the size of the result, and fill it efficiently with values.
While not required, you can implement a recursive call by wrapping this functionality in a simple helper function, which checks what iteration of splitting you are on, and splits the list further if you are not at your limit.
def expand(a):
    """
    expands a list based on average values between every two values
    """
    o = [0] * ((len(a) * 2) - 1)
    o[::2] = a
    o[1::2] = [(x + y) / 2 for x, y in zip(a, a[1:])]
    return o

def rec_expand(a, n):
    if n == 0:
        return a
    else:
        return rec_expand(expand(a), n - 1)
In action
>>> rec_expand([0, 2], 2)
[0, 0.5, 1.0, 1.5, 2]
>>> rec_expand([0, 2], 4)
[0,
0.125,
0.25,
0.375,
0.5,
0.625,
0.75,
0.875,
1.0,
1.125,
1.25,
1.375,
1.5,
1.625,
1.75,
1.875,
2]
You could do this with a for loop
import numpy as np

def add_midpoints(orig_list, n):
    for i in range(n):
        new_list = []
        for j in range(len(orig_list) - 1):
            new_list.append(np.mean(orig_list[j:(j + 2)]))
        orig_list = orig_list + new_list
        orig_list.sort()
    return orig_list
add_midpoints([0,2],1)
[0, 1.0, 2]
add_midpoints([0,2],2)
[0, 0.5, 1.0, 1.5, 2]
add_midpoints([0,2],3)
[0, 0.25, 0.5, 0.75, 1.0, 1.25, 1.5, 1.75, 2]
You can also do this totally non-recursively and without looping. What we're doing here is just making a binary scale between two numbers like on most Imperial system rulers.
def binary_scale(start, stop, level):
    length = stop - start
    scale = 2 ** level
    return [start + i * length / scale for i in range(scale + 1)]
In use:
>>> binary_scale(0, 10, 0)
[0.0, 10.0]
>>> binary_scale(0, 10, 2)
[0.0, 2.5, 5.0, 7.5, 10.0]
>>> binary_scale(10, 0, 1)
[10.0, 5.0, 0.0]
Fun with anti-patterns:
def expand(a, n):
    for _ in range(n):
        a[:-1] = sum(([a[i], (a[i] + a[i + 1]) / 2] for i in range(len(a) - 1)), [])
    return a

print(expand([0, 2], 2))
OUTPUT
% python3 test.py
[0, 0.5, 1.0, 1.5, 2]
%

"Unsorting" a Quicksort

(Quick note! While I know there are plenty of options for sorting in Python, this code is more of a generalized proof-of-concept and will later be ported to another language, so I won't be able to use any specific Python libraries or functions.
In addition, the solution you provide doesn't necessarily have to follow my approach below.)
Background
I have a quicksort algorithm and am trying to implement a method to allow later 'unsorting' of the new location of a sorted element. That is, if element A is at index x and is sorted to index y, then the 'pointer' (or, depending on your terminology, reference or mapping) array changes its value at index x from x to y.
In more detail:
You begin the program with an array, arr, with some given set of numbers. This array is later run through a quick sort algorithm, as sorting the array is important for future processing on it.
The ordering of this array is important. As such, you have another array, ref, which contains the indices of the original array such that when you map the reference array to the array, the original ordering of the array is reproduced.
Before the array is sorted, the array and mapping looks like this:
arr = [1.2, 1.5, 1.5, 1.0, 1.1, 1.8]
ref = [0, 1, 2, 3, 4, 5]
--------
map(arr,ref) -> [1.2, 1.5, 1.5, 1.0, 1.1, 1.8]
You can see that index 0 of ref points to index 0 of arr, giving you 1.2. Index 1 of ref points to index 1 of arr, giving you 1.5, and so on.
When the array has been sorted, ref should be rearranged such that when you map it according to the above procedure, it regenerates the pre-sorted arr:
arr = [1.0, 1.1, 1.2, 1.5, 1.5, 1.8]
ref = [2, 3, 4, 0, 1, 5]
--------
map(arr,ref) -> [1.2, 1.5, 1.5, 1.0, 1.1, 1.8]
Again, index 0 of ref is 2, so the first element of the mapped array is arr[2]=1.2. Index 1 of ref is 3, so the second element of the mapped array is arr[3]=1.5, and so on.
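(In code, that mapping is just an index lookup. A minimal sketch, using the 0-based indices from the description above:)
arr = [1.0, 1.1, 1.2, 1.5, 1.5, 1.8]   # the sorted array
ref = [2, 3, 4, 0, 1, 5]
# map(arr, ref): look each ref entry up in arr to rebuild the original ordering
print([arr[i] for i in ref])            # [1.2, 1.5, 1.5, 1.0, 1.1, 1.8]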
The Issue
The current implementation of my code works great for sorting, but horribly for the remapping of ref.
Given the same array arr, the output of my program looks like this:
arr = [1.0, 1.1, 1.2, 1.5, 1.5, 1.8]
ref = [3, 4, 0, 1, 2, 5]
--------
map(arr,ref) -> [1.5, 1.5, 1.0, 1.1, 1.2, 1.8]
This is a problem because this mapping is definitely not equal to the original:
[1.5, 1.5, 1.0, 1.1, 1.2, 1.8] != [1.2, 1.5, 1.5, 1.0, 1.1, 1.8]
My approach has been this:
When elements a and b, at indices x and y in arr, are switched,
then set ref[x] = y and ref[y] = x.
This is not working and I can't think of another solution that doesn't need O(n^2) time.
Thank you!
Minimal Reproducible Example
testing = [1.5, 1.2, 1.0, 1.0, 1.2, 1.2, 1.5, 1.3, 2.0, 0.7, 0.2, 1.4, 1.2, 1.8, 2.0, 2.1]

# This is the 'map(arr,ref) ->' function
def print_links(a, b):
    tt = [a[b[i] - 1] for i in range(0, len(a))]
    print("map(arr,ref) -> {}".format(tt))
    # This tests the re-mapping against an original copy of the array
    f = 0
    for i in range(0, len(testing)):
        if testing[i] == tt[i]:
            f += 1
    print("{}/{}".format(f, len(a)))
def quick_sort(arr, ref, first=None, last=None):
    if first == None:
        first = 0
    if last == None:
        last = len(arr) - 1
    if first < last:
        split = partition(arr, ref, first, last)
        quick_sort(arr, ref, first, split - 1)
        quick_sort(arr, ref, split + 1, last)

def partition(arr, ref, first, last):
    pivot = arr[first]
    left = first + 1
    right = last
    done = False
    while not done:
        while left <= right and arr[left] <= pivot:
            left += 1
        while arr[right] >= pivot and right >= left:
            right -= 1
        if right < left:
            done = True
        else:
            temp = arr[left]
            arr[left] = arr[right]
            arr[right] = temp
            # This is my attempt at preserving indices part 1
            temp = ref[left]
            ref[left] = ref[right]
            ref[right] = temp
    temp = arr[first]
    arr[first] = arr[right]
    arr[right] = temp
    # This is my attempt at preserving indices part 2
    temp = ref[first]
    ref[first] = ref[right]
    ref[right] = temp
    return right
# Main body of code
a = [1.5, 1.2, 1.0, 1.0, 1.2, 1.2, 1.5, 1.3, 2.0, 0.7, 0.2, 1.4, 1.2, 1.8, 2.0, 2.1]
b = list(range(1, len(a) + 1))  # list() so the reference array can be modified in place
print("The following should match:")
print("a = {}".format(a))
a0 = a[:]
print("ref = {}".format(b))
print("----")
print_links(a, b)
print("\nQuicksort:")
quick_sort(a, b)
print(a)
print("\nThe following should match:")
print("arr = {}".format(a0))
print("ref = {}".format(b))
print("----")
print_links(a, b)
You can do what you ask, but when we have to do something like this in real life, we usually mess with the sort's comparison function instead of the swap function. Sorting routines provided with common languages usually have that capability built in so you don't have to write your own sort.
In this procedure, you sort the ref array (called order below) by the value of the arr element it points to. This generates the same ref array you already have, but without modifying arr.
Mapping with this ordering sorts the original array. You expected it to unsort the sorted array, which is why your code isn't working.
You can invert this ordering to get the ref array you were originally looking for, or you can just leave arr unsorted and map it through order when you need it ordered.
arr = [1.5, 1.2, 1.0, 1.0, 1.2, 1.2, 1.5, 1.3, 2.0, 0.7, 0.2, 1.4, 1.2, 1.8, 2.0, 2.1]

order = list(range(len(arr)))  # list() so it can be sorted in place
order.sort(key=lambda i: arr[i])

new_arr = [arr[order[i]] for i in range(len(arr))]
print("original array = {}".format(arr))
print("sorted ordering = {}".format(order))
print("sorted array = {}".format(new_arr))

ref = [0] * len(order)
for i in range(len(order)):
    ref[order[i]] = i

unsorted = [new_arr[ref[i]] for i in range(len(ref))]
print("unsorted after sorting = {}".format(unsorted))
Output:
original array = [1.5, 1.2, 1.0, 1.0, 1.2, 1.2, 1.5, 1.3, 2.0, 0.7, 0.2, 1.4, 1.2, 1.8, 2.0, 2.1]
sorted ordering = [10, 9, 2, 3, 1, 4, 5, 12, 7, 11, 0, 6, 13, 8, 14, 15]
sorted array = [0.2, 0.7, 1.0, 1.0, 1.2, 1.2, 1.2, 1.2, 1.3, 1.4, 1.5, 1.5, 1.8, 2.0, 2.0, 2.1]
unsorted after sorting = [1.5, 1.2, 1.0, 1.0, 1.2, 1.2, 1.5, 1.3, 2.0, 0.7, 0.2, 1.4, 1.2, 1.8, 2.0, 2.1]
You don't need to maintain the map of indices and elements; just sort the indices as you sort your array. For example:
unsortedArray = [1.5, 1.2, 2.1]
unsortedIndexes = [0, 1, 2]
sortedArray = [1.2, 1.5, 2.1]
Then you just swap 0 and 1 as you sort unsortedArray and get the sortedIndexes [1, 0, 2]; you can get the original array back as sortedArray[1], sortedArray[0], sortedArray[2].
def inplace_quick_sort(s, indexes, start, end):
    if start >= end:
        return
    pivot = getPivot(s, start, end)  # it should be a func
    left = start
    right = end - 1
    while left <= right:
        while left <= right and customCmp(pivot, s[left]):
            # s[left] < pivot:
            left += 1
        while left <= right and customCmp(s[right], pivot):
            # pivot < s[right]:
            right -= 1
        if left <= right:
            s[left], s[right] = s[right], s[left]
            indexes[left], indexes[right] = indexes[right], indexes[left]
            left, right = left + 1, right - 1
    s[left], s[end] = s[end], s[left]
    indexes[left], indexes[end] = indexes[end], indexes[left]
    inplace_quick_sort(s, indexes, start, left - 1)
    inplace_quick_sort(s, indexes, left + 1, end)

def customCmp(a, b):
    return a > b

def getPivot(s, start, end):
    return s[end]
if __name__ == '__main__':
    arr = [1.5, 1.2, 1.0, 1.0, 1.2, 1.2, 1.5, 1.3, 2.0, 0.7, 0.2, 1.4, 1.2, 1.8, 2.0, 2.1]
    indexes = [i for i in range(len(arr))]
    inplace_quick_sort(arr, indexes, 0, len(arr) - 1)
    print("sorted = {}".format(arr))
    ref = [0] * len(indexes)
    for i in range(len(indexes)):
        # the core point of Matt Timmermans' answer about how to construct the ref:
        # the value of indexes[i] is an index into the original array
        # and i is the index into the sorted array,
        # so we get the map by ref[indexes[i]] = i
        ref[indexes[i]] = i
    unsorted = [arr[ref[i]] for i in range(len(ref))]
    print("unsorted after sorting = {}".format(unsorted))
It's not that horrible: you've merely reversed your reference usage. Your indices, ref, tell you how to build the sorted list from the original. However, you've used it in the opposite direction: you've applied it to the sorted list, trying to reconstruct the original. You need the inverse mapping.
Is that enough to get you to solve your problem?
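(For concreteness, a minimal sketch of inverting a permutation, using the 0-based ref that the sort actually produces in the question's description:)
ref = [3, 4, 0, 1, 2, 5]        # builds the sorted array from the original
inverse = [0] * len(ref)
for sorted_pos, orig_pos in enumerate(ref):
    inverse[orig_pos] = sorted_pos
print(inverse)                   # [2, 3, 4, 0, 1, 5], which builds the original from the sorted array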
I think you can just repair your ref array after the fact. From your code sample, just insert the following snippet after the call to quick_sort(a,b):
c = list(range(1, len(b) + 1))  # list() so it can be assigned into
for i in range(0, len(b)):
    c[b[i] - 1] = i + 1
The c array should now contain the correct references.
Stealing/rewording what @Prune writes: what you have in b is the forward transformation, the sorting itself. Applying it to a0 produces the sorted list (print_links(a0,b)).
You just have to invert it by looking up which element went to what position:
c = [b.index(i) + 1 for i in range(1, len(a) + 1)]
print_links(a, c)

Find intersection of numpy float arrays

How can I find the intersection of two numpy float arrays?:
a = np.arange(2, 3, 0.1)
b = np.array([2.3, 2.4, 2.5])
out_data = np.intersect1d(a, b)
the result is
out_data -> ndarray: []
Because of the way floats work, in your example a[3] is not 2.3, but 2.3000000000000003. This is because 0.1 does not have an exact representation in IEEE double precision floats. The intersect1d method in numpy is really only well suited for integers. To solve this, you should implement your own method that takes in a tolerance to decide if two floats are sufficiently close.
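For example, a minimal sketch using np.isclose with its default tolerances (tighten or loosen rtol/atol as appropriate for your data):
import numpy as np

a = np.arange(2, 3, 0.1)
b = np.array([2.3, 2.4, 2.5])

# keep each element of b that is within tolerance of some element of a
out_data = b[np.isclose(a[:, None], b).any(axis=0)]
print(out_data)   # [2.3 2.4 2.5]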
Here's a vectorized approach using NumPy's broadcasting capability -
tol = 1e-5 # tolerance
out = b[(np.abs(a[:,None] - b) < tol).any(0)]
Sample run -
In [31]: a
Out[31]: array([ 2. , 2.1, 2.2, 2.3, 2.4, 2.5, 2.6, 2.7, 2.8, 2.9])
In [32]: b
Out[32]: array([ 2.3 , 2.4 , 2.5 , 2.25, 2.1 ])
In [33]: tol = 1e-5 # tolerance
In [34]: b[(np.abs(a[:,None] - b) < tol).any(0)]
Out[34]: array([ 2.3, 2.4, 2.5, 2.1])
Putting my comments in function form (assuming both lists are sorted, which you should do ahead of time):
import numpy as np
from itertools import islice

def findOverlap(self, a, b, rtol=1e-05, atol=1e-08):
    ovr_a = []
    ovr_b = []
    start_b = 0
    for i, ai in enumerate(a):
        for j, bj in islice(enumerate(b), start_b, None):
            if np.isclose(ai, bj, rtol=rtol, atol=atol, equal_nan=False):
                ovr_a.append(i)
                ovr_b.append(j)
            elif bj > ai:  # (more than tolerance)
                break  # all the rest will be farther away
            else:  # bj < ai (more than tolerance)
                start_b += 1  # ignore further tests of this item
    return (ovr_a, ovr_b)
EDIT: getting rid of equal_nan -- if you're going to sort you may as well ditch the nans
EDIT: using islice instead of array slice
EDIT: fixed bug
The following routine will return the indexes of common values within specified tolerance(s) relative to list a.
def findOverlap(self, a, b, rtol=1e-05, atol=1e-08, equal_nan=False):
    overlap_indexes = []
    for i, item_a in enumerate(a):
        for item_b in b:
            if np.isclose(item_a, item_b, rtol=rtol, atol=atol, equal_nan=equal_nan):
                overlap_indexes.append(i)
    return overlap_indexes
e.g.
a = np.arange(2, 3, 0.1).tolist()
b = np.array([2.3, 2.4, 2.5]).tolist()
self.findOverlap(a, b)
-> overlap_indexes:[3, 4, 5]
