import numpy as np
X = [-10000, -1000, -100, -10, -1, 0, 1, 10, 100, 1000, 10000]
l = np.where(np.array(X) > 100)[0]
So I have
l = array([ 9, 10], dtype=int64)
Now I want to get a new array X containing the elements of X at the indices stored in l. I want to get:
X = [1000, 10000]
I thought of:
X = X[l]
But it does not work. What is the proper function to use in this case? I don't want to use a for loop.
You need to convert your list X into a numpy array before you can index it with a list of indices:
>>> X = np.array(X)[l]
>>> X
array([ 1000, 10000])
You should convert the list X to a NumPy array
x=np.array(X)
and then index it with a boolean condition
x=x[x>100]
This gives the result you need without calling np.where().
The expression x>100 creates a boolean array whose True entries correspond to the elements of x that are larger than 100. This array can then be used as an index to extract the elements that satisfy the condition.
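As a minimal sketch of the boolean-mask idea, the snippet below builds the mask explicitly and checks that it selects the same elements as the np.where route from the question. The names mask, via_mask and via_where are illustrative only and do not come from the original posts.

```python
import numpy as np

X = [-10000, -1000, -100, -10, -1, 0, 1, 10, 100, 1000, 10000]
x = np.array(X)

# Boolean array with one entry per element of x
mask = x > 100

# Route 1: index directly with the boolean mask
via_mask = x[mask]

# Route 2: turn the mask into integer indices first, as in the question
via_where = x[np.where(mask)[0]]

print(via_mask)
print(np.array_equal(via_mask, via_where))
```

Both routes should select the same elements; the mask version simply skips the intermediate index array.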
I have an n row, m column numpy array, and would like to create a new k x m array by selecting k random elements from each column of the array. I wrote the following python function to do this, but would like to implement something more efficient and faster:
import random

def sample_array_cols(MyMatrix, nelements):
    vmat = []
    TempMat = MyMatrix.T
    for v in TempMat:
        v = np.ndarray.tolist(v)
        subv = random.sample(v, nelements)
        vmat = vmat + [subv]
    return(np.array(vmat).T)
One question is whether there's a way to loop over each column without transposing the array (and then transposing back). More importantly, is there some way to map the random sample onto each column that would be faster than having a for loop over all columns? I don't have that much experience with numpy objects, but I would guess that there should be something analogous to apply/mapply in R that would work?
One alternative is to randomly generate the indices first, and then use take_along_axis to map them to the original array (note that randint draws the indices with replacement):
arr = np.random.randn(1000, 5000) # arbitrary
k = 10 # arbitrary
n, m = arr.shape
idx = np.random.randint(0, n, (k, m))
new = np.take_along_axis(arr, idx, axis=0)
Output (shape):
In [215]: new.shape
Out[215]: (10, 5000) # (k x m)
To sample each column without replacement, just like your original solution:
import numpy as np
matrix = np.arange(4*3).reshape(4,3)
matrix
Output
array([[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8],
[ 9, 10, 11]])
k = 2
np.take_along_axis(matrix, np.random.rand(*matrix.shape).argsort(axis=0)[:k], axis=0)
Output
array([[ 9, 1, 2],
[ 3, 4, 11]])
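The argsort trick above can be wrapped in a small helper. This is only a sketch: the function name sample_cols_without_replacement and its seed parameter are assumptions made for illustration, and it uses np.random.default_rng rather than the legacy np.random.rand from the answer.

```python
import numpy as np

def sample_cols_without_replacement(matrix, k, seed=None):
    """Pick k distinct rows from each column independently (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    # Argsorting uniform noise down each column gives an independent
    # random ordering of the row indices for every column
    order = rng.random(matrix.shape).argsort(axis=0)
    # Keep the first k row indices per column and gather those elements
    return np.take_along_axis(matrix, order[:k], axis=0)

matrix = np.arange(4 * 3).reshape(4, 3)
sample = sample_cols_without_replacement(matrix, k=2, seed=0)
print(sample.shape)
```

Because each column of order is a permutation of the row indices, no row index can appear twice within a column, which is what makes the sampling "without replacement".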
I would:
1. Pre-allocate the result array and fill in its columns, and
2. Use NumPy integer-array indexing.
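import numpy

def sample_array_cols(matrix, n_result):
    n, m = matrix.shape
    # Pre-allocate the (n_result, m) result array
    vmat = numpy.empty((n_result, m), dtype=matrix.dtype)
    for c in range(m):
        # Random row indices for this column (drawn with replacement)
        random_indices = numpy.random.randint(0, n, n_result)
        vmat[:, c] = matrix[random_indices, c]
    return vmat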
Not quite fully vectorized, but better than building up a list, and the code scans just like your description.
I am trying to write a function that takes two arrays and returns, for each value in the first array, the index of the first element of the second array that is greater than or equal to it (or -1 if there is none). The code below gives the result I want, but I am trying to get rid of the for loop and find a way to vectorize it using numpy functions:
x_array = np.array([25, 32, 3, 99, 300])
y_array = np.array([30, 33, 56, 99, 250])
result = [0, 1, 0, 3, -1]
def get_index(x_array, y_array):
    result = []
    for x in x_array:
        index = np.where(x <= y_array)[0]
        if index.size != 0:
            result.append(index.min())
        else:
            result.append(-1)
    return result
You are looking for np.searchsorted, which assumes that y_array is sorted in ascending order (as it is in your example):
indices = np.searchsorted(y_array, x_array)
The only difference is that this returns the size of the array if you exceed the maximum element:
>>> indices
array([0, 1, 0, 3, 5], dtype=int64)
If you need to get -1 instead, you can use np.where or direct masking:
indices = np.where(indices < y_array.size, indices, -1)
OR
indices[indices >= y_array.size] = -1
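Putting the two steps together, a vectorized replacement for the loop might look like the sketch below. The function name get_index_vectorized is made up for illustration, and the code assumes y_array is already sorted in ascending order, since np.searchsorted relies on that.

```python
import numpy as np

def get_index_vectorized(x_array, y_array):
    """Index of the first y >= x for each x, or -1 if none (assumes sorted y_array)."""
    # side="left" returns the first position i with y_array[i] >= x
    indices = np.searchsorted(y_array, x_array, side="left")
    # Positions equal to y_array.size mean x exceeds every element of y_array
    return np.where(indices < y_array.size, indices, -1)

x_array = np.array([25, 32, 3, 99, 300])
y_array = np.array([30, 33, 56, 99, 250])
result = get_index_vectorized(x_array, y_array)
print(result.tolist())  # [0, 1, 0, 3, -1]
```

If y_array is not sorted, this shortcut no longer matches the loop version, so it would need sorting first (and mapping the indices back) or a different approach.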
How can I convert a list into a 2D NumPy ndarray? For example:
lst = [20, 30, 40, 50, 60]
Expected result:
>> print(arr)
>> array([[20],
[30],
[40],
[50],
[60]])
>> print(arr.shape)
>> (5, 1)
Convert it to array and reshape:
x = np.array(x).reshape(-1,1)
reshape adds the column structure. The -1 tells reshape to infer the number of rows from the length of the array.
output:
[[20]
[30]
[40]
[50]
[60]]
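To make the role of -1 concrete, here is a short sketch using the list from the question. The variable names col and row are illustrative only.

```python
import numpy as np

lst = [20, 30, 40, 50, 60]
arr = np.array(lst)

# -1 is inferred from the total number of elements (5 here)
col = arr.reshape(-1, 1)   # column vector: 5 rows, 1 column
row = arr.reshape(1, -1)   # row vector: 1 row, 5 columns

print(col.shape)
print(row.shape)
```

Only one dimension may be given as -1, because NumPy needs the remaining dimensions to work out the missing one.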
If you need your calculations to be more efficient, use NumPy arrays instead of list comprehensions. An alternative is to add a new axis by indexing with None (an alias for np.newaxis):
x = [20, 30, 40, 50, 60]
x = np.array(x) # convert your list to a NumPy array
result = x[:, None] # add a new axis, giving shape (5, 1)
If you still need a list at the end, you can convert the result efficiently with result.tolist().
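As a quick, hedged check of the indexing approach: None in an index is an alias for np.newaxis, so both spellings should produce the same (5, 1) array as reshape(-1, 1). The variable names below are illustrative.

```python
import numpy as np

x = np.array([20, 30, 40, 50, 60])

with_none = x[:, None]           # None inserts a new axis of length 1
with_newaxis = x[:, np.newaxis]  # np.newaxis is the same object as None
with_reshape = x.reshape(-1, 1)

# Convert back to a nested Python list if a list is needed downstream
as_list = with_none.tolist()
print(with_none.shape)
print(as_list)
```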
You may use a list comprehension and then convert it to numpy array:
import numpy as np
x = [20, 30, 40, 50, 60]
x_nested = [[item] for item in x]
x_numpy = np.array(x_nested)
Suppose x = np.array([[30,60,70],[100,20,80]]) and I wish to remove all elements that are <60. That is, the resulting array should be x = np.array([[60,70],[100,80]]).
I use np.where(x < 60) to find the indices of the elements to remove, and I get indices = (array([0, 1]), array([0, 1])). However, when I try to delete the elements of x via np.delete(x, indices), I get array([ 70, 100, 20, 80]) rather than what I was hoping for.
What can I do to achieve the desired result?
import numpy as np
x = np.array([[30, 60, 70],
[100, 20, 80]])
new_x = np.array([(np.delete(i, np.where(i < 60)[0])) for i in x])
print(new_x)
I got it working this way, but I don't know whether it is too slow for large arrays.
import numpy as np
d = np.array([
[30,60,70],
[100, 20, 80]
])
f = lambda row: row >= 60  # keep the elements that are at least 60
a = np.array([row[f(row)] for row in d])
print(a)
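When every row keeps the same number of elements, as in this example, the per-row loop can probably be avoided with a single boolean mask followed by a reshape. The sketch below rests on that equal-count assumption and falls back to a list of 1D arrays otherwise; the names mask, kept_per_row and new_x are illustrative.

```python
import numpy as np

x = np.array([[30, 60, 70],
              [100, 20, 80]])

mask = x >= 60                    # True where the element should be kept
kept_per_row = mask.sum(axis=1)   # how many elements survive in each row

if np.all(kept_per_row == kept_per_row[0]):
    # Boolean indexing flattens in row-major order, so reshaping
    # restores one row of survivors per original row
    new_x = x[mask].reshape(x.shape[0], -1)
else:
    # Rows of different lengths cannot form a regular 2D array
    new_x = [row[m] for row, m in zip(x, mask)]

print(new_x)
```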
I have 3 NumPy arrays which store image data of shape (4,100,100).
arr1= np.load(r'C:\Users\x\Desktop\py\output\a1.npy')
arr2= np.load(r'C:\Users\x\Desktop\py\output\a2.npy')
arr3= np.load(r'C:\Users\x\Desktop\py\output\a3.npy')
I want to merge all 3 arrays into 1 array.
I have tried in this way:
merg_arr = np.zeros((len(arr1)+len(arr2)+len(arr3), 4,100,100), dtype=input_img.dtype)
This makes an array of the required length, but I don't know how to copy all the data into it. Maybe using a loop?
This will do the trick:
merge_arr = np.concatenate([arr1, arr2, arr3], axis=0)
np.concatenate joins the arrays along an existing axis (here the first one), so all their other dimensions need to match. np.stack, by contrast, arranges arrays along a new dimension.
Demo:
arr1 = np.empty((60, 4, 10, 10))
arr2 = np.empty((14, 4, 10, 10))
arr3 = np.empty((6, 4, 10, 10))
merge_arr = np.concatenate([arr1, arr2, arr3], axis=0)
print(merge_arr.shape) # (80, 4, 10, 10)
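For contrast, here is a small sketch of how np.concatenate and np.stack differ in the shapes they produce. The array names a, b and c and their sizes are arbitrary choices for illustration.

```python
import numpy as np

a = np.zeros((60, 4, 10, 10))
b = np.zeros((14, 4, 10, 10))
c = np.zeros((60, 4, 10, 10))

# concatenate: joins along an existing axis, so only that axis may differ
joined = np.concatenate([a, b], axis=0)

# stack: creates a new leading axis, so all inputs must have identical shapes
stacked = np.stack([a, c], axis=0)

print(joined.shape)   # (74, 4, 10, 10)
print(stacked.shape)  # (2, 60, 4, 10, 10)
```

For merging batches of images with different batch sizes, concatenate along axis 0 is the one that fits.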