I am trying to write a function that takes two arrays and returns an array of indices: for each value in the first array, the index at which it would be located in the second array. The code below gives the result I want, but I would like to get rid of the for loop and vectorize it using numpy functions:
x_array = np.array([25, 32, 3, 99, 300])
y_array = np.array([30, 33, 56, 99, 250])
result = [0, 1, 0, 3, -1]
def get_index(x_array, y_array):
    result = []
    for x in x_array:
        index = np.where(x <= y_array)[0]
        if index.size != 0:
            result.append(index.min())
        else:
            result.append(-1)
    return result
You are looking for np.searchsorted (it assumes y_array is sorted in ascending order, as it is in your example):
indices = np.searchsorted(y_array, x_array)
The only difference is that this returns the size of the array if you exceed the maximum element:
>>> indices
array([0, 1, 0, 3, 5], dtype=int64)
If you need to get -1 instead, you can use np.where or direct masking:
indices = np.where(indices < y_array.size, indices, -1)
OR
indices[indices >= y_array.size] = -1
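Putting the steps above together, a minimal sketch of a loop-free version of get_index could look like this. It assumes y_array is sorted in ascending order, and the name get_index_vectorized is made up for illustration.

```python
import numpy as np

def get_index_vectorized(x_array, y_array):
    # Assumption: y_array is sorted ascending (np.searchsorted requires this).
    y_array = np.asarray(y_array)
    # side="left" gives the first position i with x <= y_array[i]
    indices = np.searchsorted(y_array, x_array, side="left")
    # Values larger than every element land at y_array.size; map those to -1
    return np.where(indices < y_array.size, indices, -1)

x_array = np.array([25, 32, 3, 99, 300])
y_array = np.array([30, 33, 56, 99, 250])
result = get_index_vectorized(x_array, y_array)
print(result.tolist())  # [0, 1, 0, 3, -1]
```

If y_array is not sorted, the original loop and this sketch can disagree, so sorting (or keeping the loop) would be needed in that case.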
import numpy as np
X = [-10000, -1000, -100, -10, -1, 0, 1, 10, 100, 1000, 10000]
l = np.where(np.array(X) > 100)[0]
So I have
l = array([ 9, 10], dtype=int64)
Now I want to get a new array X containing the elements of the original X at the indices stored in l. I want to get:
X = [1000, 10000]
I thought of:
X = X[l]
But it does not work. What is the proper function to use in this case? I don't want to use a for loop.
You need to convert your list X into a numpy array before you can index it with a list of indices:
>>> X = np.array(X)[l]
>>> X
array([ 1000, 10000])
You should convert the list X to a numpy array
x=np.array(X)
and then use indexing
x=x[x>100]
This gives the result you need without using the where() function.
The expression "x>100" creates a boolean array whose True elements correspond to the elements of x that are larger than 100. This array can be used as an index to extract the elements satisfying the condition.
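As a small sketch of how the two answers relate, the snippet below selects the same elements once through the index array returned by np.where and once through the boolean mask. The names mask, idx, via_where and via_mask are only for illustration.

```python
import numpy as np

X = [-10000, -1000, -100, -10, -1, 0, 1, 10, 100, 1000, 10000]
x = np.array(X)          # a plain list cannot be indexed with an index array

mask = x > 100           # boolean array, True where the condition holds
idx = np.where(mask)[0]  # integer positions of the True entries

via_where = x[idx]       # integer (fancy) indexing
via_mask = x[mask]       # boolean indexing, no np.where needed

print(idx.tolist())        # [9, 10]
print(via_where.tolist())  # [1000, 10000]
print(via_mask.tolist())   # [1000, 10000]
```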
Let's say I have a two-dimensional array
import numpy as np
a = np.array([[1, 1, 1], [2,2,2], [3,3,3]])
and I would like to replace the third vector (in the second dimension) with zeros. I would do
a[:, 2] = np.array([0, 0, 0])
But what if I would like to be able to do that programmatically? I mean, let's say that a variable x = 1 contained the dimension on which I wanted to do the replacing. How would the function replace(arr, dimension, value, arr_to_be_replaced) have to look if I wanted to call it as replace(a, x, 2, np.array([0, 0, 0]))?
numpy has a similar function, insert. However, it doesn't replace at dimension i, it returns a copy with an additional vector.
All solutions are welcome, but I prefer one that doesn't recreate the array, so as to save memory.
arr[:, 1]
is basically shorthand for
arr[(slice(None), 1)]
that is, a tuple with slice elements and integers.
Knowing that, you can construct a tuple of slice objects manually, adjust the values depending on an axis parameter and use that as your index. So for
import numpy as np
arr = np.array([[1, 1, 1], [2, 2, 2], [3, 3, 3]])
axis = 1
idx = 2
arr[:, idx] = np.array([0, 0, 0])
#      ^- axis position
you can use
slices = [slice(None)] * arr.ndim
slices[axis] = idx
arr[tuple(slices)] = np.array([0, 0, 0])
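Wrapped into a function, the idea above might be sketched as follows. The name replace_along_axis and its argument order are assumptions chosen for illustration, not an existing numpy function. Because the assignment goes through a basic index, the array is modified in place and no copy is made.

```python
import numpy as np

def replace_along_axis(arr, axis, idx, values):
    """Overwrite the slice at position `idx` along `axis`, in place."""
    slices = [slice(None)] * arr.ndim  # equivalent to ':' on every axis
    slices[axis] = idx                 # pin the chosen axis to one position
    arr[tuple(slices)] = values        # writes into arr itself, no copy
    return arr

a = np.array([[1, 1, 1], [2, 2, 2], [3, 3, 3]])
replace_along_axis(a, 1, 2, np.array([0, 0, 0]))
print(a.tolist())  # [[1, 1, 0], [2, 2, 0], [3, 3, 0]]
```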
Suppose x = np.array([[30,60,70],[100,20,80]]) and I wish to remove all elements that are <60. That is, the resulting array should be x = np.array([[60,70],[100,80]]).
I use np.where(x < 60) to find the indices of the elements to remove, and I get indices = (array([0, 1]), array([0, 1])). However, when I try to delete the elements in x via np.delete(x, indices), I get array([ 70, 100, 20, 80]) rather than what I was hoping for.
What can I do to achieve the desired result?
import numpy as np
x = np.array([[30, 60, 70],
              [100, 20, 80]])
new_x = np.array([(np.delete(i, np.where(i < 60)[0])) for i in x])
print(new_x)
I got it working this way, but I don't know whether it is too slow for large arrays.
import numpy as np
d = np.array([
    [30, 60, 70],
    [100, 20, 80]
])
# Keep values >= 60 (the question wants to drop everything below 60)
keep = lambda row: row >= 60
a = np.array([row[keep(row)] for row in d])
print(a)
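Both answers rely on every row keeping the same number of elements, otherwise the result is not a rectangular array. A hedged sketch of a variant: apply one boolean mask to the whole array and reshape when the per-row counts match, and fall back to a list of 1D arrays when they do not. The helper name filter_rows is hypothetical.

```python
import numpy as np

def filter_rows(x, threshold):
    """Keep elements >= threshold in each row of a 2D array."""
    mask = x >= threshold
    counts = mask.sum(axis=1)  # number of kept elements per row
    if np.all(counts == counts[0]):
        # Every row keeps the same count, so the flat result reshapes cleanly
        return x[mask].reshape(x.shape[0], counts[0])
    # Rows differ in length: return a list of 1D arrays instead
    return [row[m] for row, m in zip(x, mask)]

x = np.array([[30, 60, 70],
              [100, 20, 80]])
new_x = filter_rows(x, 60)
print(new_x.tolist())  # [[60, 70], [100, 80]]
```

The mask-and-reshape path avoids the Python-level loop over rows, which may help on large arrays.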
I pass an array of shape (734, 814, 3) to a function, but numpy.where() gives a one-dimensional result instead of the two-dimensional one I would expect for a 2D array.
def hsi2rgb(img):
    img_rgb = np.empty_like(img)
    h = img[:,:,0]  #(734,814)
    s = img[:,:,1]  #(734,814)
    i = img[:,:,2]  #(734,814)
    l1 = 0.00
    l2 = 2*3.14/3
    l3 = 4*3.14/3
    l4 = 3.14
    r1 = np.where(np.logical_and(h>=l1, h<l2))  #(99048,)
    r2 = np.where(np.logical_and(h>=l2, h<l3))
    r3 = np.where(np.logical_and(h>=l3, h<l4))
    hs = h[r1]
    return img_rgb
r1 is shown to be a tuple, and r1[0] and r1[1] are both of size 99048, which shouldn't be the case. r1 should have the row indices and column indices of those values which satisfy the condition. I tried it without the logical and, using just one condition, but the problem persists.
I followed your code, and np.where returned the expected result: a tuple with two 1D arrays containing the indexes where the condition is met:
import numpy as np
h = np.random.uniform(size=(734, 814))
r1 = np.where(np.logical_and(h >= 0.1, h < 0.9))
print(r1[0].shape, r1[1].shape) # (478129,) (478129,)
This means that 478129 elements met the condition. For each of them, r1[0] will have its row index, and r1[1] will have its column index. Namely, if r1 looks like
(array([ 0, 0, 0, ..., 733, 733, 733]), array([ 0, 1, 2, ..., 808, 809, 811]))
then I know that h[0, 0], h[0, 1], h[0, 2], etc. satisfy the conditions: the row index comes from the first array, the column index from the second. This structure may be less readable, but it can be used directly for indexing the array h.
The transposed form of the output is more readable, being a 2D array with row-column index pairs:
array([[  0,   0],
       [  0,   1],
       [  0,   2],
       ...,
       [733, 808],
       [733, 809],
       [733, 811]])
It can be obtained by transposing r1 (if you need the original r1 as well), or directly with np.argwhere:
r1 = np.argwhere(np.logical_and(h >= 0.1, h < 0.9))
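To make the two output layouts concrete, here is a small sketch on a hand-made 2x3 array (the values are invented for illustration). It shows the tuple from np.where, the pairs from np.argwhere, and that transposing the former gives the latter.

```python
import numpy as np

h = np.array([[0.05, 0.50, 0.95],
              [0.30, 0.99, 0.70]])
cond = np.logical_and(h >= 0.1, h < 0.9)

rows, cols = np.where(cond)  # one 1D array per dimension
pairs = np.argwhere(cond)    # one (row, col) pair per match

print(rows.tolist())   # [0, 1, 1]
print(cols.tolist())   # [1, 0, 2]
print(pairs.tolist())  # [[0, 1], [1, 0], [1, 2]]

# The tuple from np.where indexes h directly and returns the matching values
print(h[rows, cols].tolist())  # [0.5, 0.3, 0.7]

# Transposing the np.where result gives the same pairs as np.argwhere
assert np.array_equal(np.transpose((rows, cols)), pairs)
```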
I have two arrays, and I have a complex condition like this: new_arr<0 and old_arr>0
I am using nonzero but I am getting an error. The code I have is this:
indices = nonzero(new_arr<0 and old_arr>0)
I tried:
indices = nonzero(new_arr<0) and nonzero(old_arr>0)
But it gave me incorrect results.
Is there any way around this? And is there a way to get the common indices from two nonzero statements? For example, if:
indices1 = nonzero(new_arr<0)
indices2 = nonzero(old_arr>0)
and these two indices would contain:
indices1 = array([0, 1, 3])
indices2 = array([2, 3, 4])
The correct result would be getting the common element from these two (in this case it would be the element 3). Something like this:
result = common(indices1, indices2)
Try indices = nonzero((new_arr < 0) & (old_arr > 0)):
In [5]: import numpy as np
In [6]: old_arr = np.array([ 0,-1, 0,-1, 1, 1, 0, 1])
In [7]: new_arr = np.array([ 1, 1,-1,-1,-1,-1, 1, 1])
In [8]: np.nonzero((new_arr < 0) & (old_arr > 0))
Out[8]: (array([4, 5]),)
Try
indices = nonzero(logical_and(new_arr < 0, old_arr > 0))
(Thinking about it, my previous example wasn't all that useful if all it did was return nonzero(condition) anyway.)
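For the second part of the question (the common indices of two nonzero results), one option is np.intersect1d. The sketch below reuses the arrays from the first answer and shows that intersecting the two index arrays matches the combined condition. Note that Python's `and` does not work element-wise: it asks for the truth value of a whole array, which raises a ValueError for arrays with more than one element.

```python
import numpy as np

old_arr = np.array([0, -1, 0, -1, 1, 1, 0, 1])
new_arr = np.array([1, 1, -1, -1, -1, -1, 1, 1])

# Element-wise AND; the parentheses matter because & binds tighter than < and >
combined = np.nonzero((new_arr < 0) & (old_arr > 0))[0]

# Same result by intersecting the two separate index arrays
indices1 = np.nonzero(new_arr < 0)[0]
indices2 = np.nonzero(old_arr > 0)[0]
common = np.intersect1d(indices1, indices2)

print(combined.tolist())  # [4, 5]
print(common.tolist())    # [4, 5]
```

Combining the conditions first avoids building the intermediate index arrays, so it is usually the simpler choice when both conditions are available.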