Random indices from masked array

I have a 2D MaskedArray X and I want to randomly select 30 non-masked elements from it and return their indices idx.
The goal is that I could use the indices to read / set values efficiently later in my code:
selected = X[idx]
X[idx] = a # some arrays with the same length
What is the most efficient way of generating idx?

OK, I have figured out a way. If anyone has a better approach, please let me know.
pos = np.random.choice(X.count(), size=30, replace=False)  # replace=False so no element is picked twice
idx = tuple(np.take((~X.mask).nonzero(), pos, axis=1))
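A variant of the same idea, as a minimal sketch rather than a definitive answer. It assumes the newer Generator API (np.random.default_rng) is available and uses np.ma.getmaskarray so that it also works when X has no masked elements at all (in which case X.mask is a scalar). The example array and the seed are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded only so the sketch is reproducible

# Hypothetical example data: a 6x8 masked array with values below 10 masked out.
X = np.ma.masked_less(np.arange(48).reshape(6, 8), 10)

# getmaskarray always returns a full boolean array, even when nothing is masked.
valid = ~np.ma.getmaskarray(X)
rows, cols = np.nonzero(valid)

# Pick 30 distinct positions among the non-masked elements.
pos = rng.choice(rows.size, size=30, replace=False)
idx = (rows[pos], cols[pos])

selected = X[idx]                      # read the sampled values
X[idx] = np.zeros(30, dtype=X.dtype)   # write back an array of the same length
```

Because idx is a tuple of two index arrays, it can be reused for both reading and writing, as in the question.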

I solved a similar task by passing the inverted mask (True/False) as weights to np.random.choice:
import numpy.ma as ma
import numpy as np
data = np.array([[0, 0, 0, 1], [0, 1, 3, 2], [2, 0, 0, 3], [0, 3, 4, 1]])
numSample = 2
masked = ma.masked_where(data < 3, data)
weights = ~masked.mask + 0  # non-masked elements get weight 1, masked elements get 0
normalized = weights.ravel() / float(weights.sum())
index = np.random.choice(
    masked.size,
    size=numSample,
    replace=False,
    p=normalized
)
idx, idy = np.unravel_index(index, data.shape)

Related

Random sample from specific rows and columns of a 2d numpy array (essentially sampling by ignoring edge effects)

I have a 2d numpy array size 100 x 100.
I want to randomly sample values from the "inside" 80 x 80 values so that I can exclude values which are influenced by edge effects. I want to sample from row 10 to row 90 and within that from column 10 to column 90.
However, importantly, I need to retain the original index values from the 100 x 100 grid, so I can't just trim the dataset and move on. If I do that, I am not really solving the edge effect problem because this is occurring within a loop with multiple iterations.
gridsize = 100
new_abundances = np.zeros([100,100],dtype=np.uint8)
min_select = int(np.around(gridsize * 0.10))
max_select = int(gridsize - (np.around(gridsize * 0.10)))
row_idx =np.arange(min_select,max_select)
col_idx = np.arange(min_select,max_select)
indices_random = ????? Somehow randomly sample from new_abundances only within the rows and columns of row_idx and col_idx set.
What I ultimately need is a list of 250 random indices selected from within the flattened new_abundances array. I need to keep the new_abundances array as 2d to identify the "edges" but once that is done, I need to flatten it to get the indices which are randomly selected.
Desired output:
A 1D list of indices into the flattened new_abundances array.

Would something like this solve your problem?
import numpy as np
np.random.seed(0)
mat = np.random.random(size=(100,100))
x_indices = np.random.randint(low=10, high=90, size=250)
y_indices = np.random.randint(low=10, high=90, size=250)
coordinates = list(zip(x_indices,y_indices))
flat_mat = mat.flatten()
flat_index = x_indices * 100 + y_indices
Then you can access elements using any value from the coordinates list, e.g. mat[coordinates[0]] returns the matrix value at coordinates[0]. The value of coordinates[0] is (38, 45) in my case. If the matrix is flattened, you can calculate the 1D index of the corresponding element: mat[coordinates[0]] == flat_mat[flat_index[0]] holds, where flat_index[0] == 3845 == 100 * 38 + 45.
Please also note that this samples with replacement, so the same element can be drawn more than once.
Using your notation:
import numpy as np
np.random.seed(0)
gridsize = 100
new_abundances = np.zeros([100,100],dtype=np.uint8)
min_select = int(np.around(gridsize * 0.10))
max_select = int(gridsize - (np.around(gridsize * 0.10)))
x_indices = np.random.randint(low=min_select, high=max_select, size=250)
y_indices = np.random.randint(low=min_select, high=max_select, size=250)
coords = list(zip(x_indices,y_indices))
flat_new_abundances = new_abundances.flatten()
flat_index = x_indices * gridsize + y_indices
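If each of the 250 cells must be distinct, one possible sketch is to sample without replacement from the 80 x 80 interior and then map back to flat indices of the full grid with np.ravel_multi_index. This assumes the Generator API (np.random.default_rng); the seed and the names inner, r and c are only illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded only for reproducibility

gridsize = 100
new_abundances = np.zeros((gridsize, gridsize), dtype=np.uint8)
min_select = int(np.around(gridsize * 0.10))  # 10
max_select = gridsize - min_select            # 90

row_idx = np.arange(min_select, max_select)
col_idx = np.arange(min_select, max_select)

# Draw 250 distinct cells from the interior block (80 x 80 = 6400 candidates).
inner = rng.choice(row_idx.size * col_idx.size, size=250, replace=False)
r, c = np.unravel_index(inner, (row_idx.size, col_idx.size))

# Map interior positions back to the full grid, then to flat indices.
rows, cols = row_idx[r], col_idx[c]
indices_random = np.ravel_multi_index((rows, cols), new_abundances.shape)
```

indices_random can then be used directly on new_abundances.ravel(), and it is equal to rows * gridsize + cols.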

fast way to find the index of the closest from an array to each element in another array

I want to find a faster way to get the index of the nearest element in an array of shape (k,l) for each element in another array of shape (n,l). I found 2 solutions, but I think the performance can be improved.
Here is an example:
import numpy as np
import numpy.matlib as ml
frames = np.random.random([1000,2])
codeBook = np.random.random([8,2])
I
dist = np.zeros([frames.shape[0], codeBook.shape[0]])
for i in range(8):
    difference = frames - ml.repmat(codeBook[i,:], frames.shape[0], 1)
    dist[:,i] = np.sqrt(np.sum(difference**2, 1))
idx = np.argmin(dist, axis=1)
II
diffToCB = frames - np.rot90(ml.repmat(codeBook,frames.shape[0],1).reshape(-1,8,2),axes=(1,0))
idx = np.argmin(np.sqrt(np.einsum('ijk,ijk->ij', diffToCB, diffToCB)) , axis=0)
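One thing that might be worth trying, as a sketch only and not benchmarked here: let broadcasting build the differences instead of repmat, and skip the square root, since it does not change which distance is smallest. The variable names diff and dist2 are mine.

```python
import numpy as np

rng = np.random.default_rng(0)
frames = rng.random((1000, 2))
codeBook = rng.random((8, 2))

# (1000, 1, 2) - (1, 8, 2) broadcasts to (1000, 8, 2): every frame vs. every code vector.
diff = frames[:, None, :] - codeBook[None, :, :]

# Squared distances; the square root is monotonic, so argmin is unchanged without it.
dist2 = np.einsum('ijk,ijk->ij', diff, diff)
idx = dist2.argmin(axis=1)
```

For much larger codebooks, a KD-tree such as scipy.spatial.cKDTree may be worth comparing against, though whether it is faster depends on the sizes involved.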

how to randomly sample in 2D matrix in numpy

I have a 2D array/matrix like the one below. How would I randomly pick a value from it, for example getting a value like [-62, 29.23]? I looked at numpy.random.choice, but it is built for 1D arrays.
The following is my example with 4 rows and 8 columns
Space_Position=[
[[-62,29.23],[-49.73,29.23],[-31.82,29.23],[-14.2,29.23],[3.51,29.23],[21.21,29.23],[39.04,29.23],[57.1,29.23]],
[[-62,11.28],[-49.73,11.28],[-31.82,11.28],[-14.2,11.28],[3.51,11.28],[21.21,11.28] ,[39.04,11.28],[57.1,11.8]],
[[-62,-5.54],[-49.73,-5.54],[-31.82,-5.54] ,[-14.2,-5.54],[3.51,-5.54],[21.21,-5.54],[39.04,-5.54],[57.1,-5.54]],
[[-62,-23.1],[-49.73,-23.1],[-31.82,-23.1],[-14.2,-23.1],[3.51,-23.1],[21.21,-23.1],[39.04,-23.1] ,[57.1,-23.1]]
]
In the answers the following solution was given:
random_index1 = np.random.randint(0, Space_Position.shape[0])
random_index2 = np.random.randint(0, Space_Position.shape[1])
Space_Position[random_index1][random_index2]
This indeed gives me one sample, but how do I get more than one sample, like np.random.choice() does?
Another way I am thinking of is to transform the matrix into a flat array of pairs, like
Space_Position=[
[-62,29.23],[-49.73,29.23],[-31.82,29.23],[-14.2,29.23],[3.51,29.23],[21.21,29.23],[39.04,29.23],[57.1,29.23], ..... ]
and finally use np.random.choice(). However, I could not find a way to do the transformation; flatten() makes the array look like
Space_Position=[-62,29.23,-49.73,29.2, ....]

Just use random indices (in your case two of them, because the array has 3 dimensions):
import numpy as np
Space_Position = np.array(Space_Position)
random_index1 = np.random.randint(0, Space_Position.shape[0])
random_index2 = np.random.randint(0, Space_Position.shape[1])
Space_Position[random_index1, random_index2] # get the random element.
The alternative is to actually make it 2D:
Space_Position = np.array(Space_Position).reshape(-1, 2)
and then use one random index:
Space_Position = np.array(Space_Position).reshape(-1, 2) # make it 2D
random_index = np.random.randint(0, Space_Position.shape[0]) # generate a random index
Space_Position[random_index] # get the random element.
If you want N samples with replacement:
N = 5
Space_Position = np.array(Space_Position).reshape(-1, 2) # make it 2D
random_indices = np.random.randint(0, Space_Position.shape[0], size=N) # generate N random indices
Space_Position[random_indices] # get N samples with replacement
or without replacement:
Space_Position = np.array(Space_Position).reshape(-1, 2) # make it 2D
random_indices = np.arange(0, Space_Position.shape[0]) # array of all indices
np.random.shuffle(random_indices) # shuffle the array
Space_Position[random_indices[:N]] # get N samples without replacement

Referring to the documentation of numpy.random.choice:
Sampling random rows from a 2-D array is not possible with this function, but is possible with Generator.choice through its axis keyword.
See the documentation for numpy.random.Generator.choice.
Using this knowledge, you can create a generator and then draw choices from your array:
rng = np.random.default_rng()  # creates the generator, e.g. Generator(PCG64) at 0x2AA703BCE50
N = 3  # number of choices
a = np.array(Space_Position)  # makes sure a is an ndarray
s = a.shape  # (4, 8, 2)
a = a.reshape((s[0] * s[1], s[2]))  # makes the array 2-dimensional, keeping the last dimension separate
a.shape  # (32, 2)
b = rng.choice(a, N, axis=0, replace=False)  # returns N rows of a in array b, e.g. array([[ 57.1 , 11.8 ], [ 21.21, -5.54], [ 39.04, 11.28]])
# Note: replace=False prevents the same entry from appearing several times in the result

row = Space_Position[np.random.randint(0, len(Space_Position))]  # pick a random row
row[np.random.randint(0, len(row))]  # pick a random entry within that row
gives you what you want.

Inverted fancy indexing

Having an array and a mask for this array, using fancy indexing, it is easy to select only the data of the array corresponding to the mask.
import numpy as np
a = np.arange(20).reshape(4, 5)
mask = [0, 2]
data = a[:, mask]
But is there a rapid way to select all the data of the array that does not belong to the mask (i.e. the mask is the data we want to reject)?
I tried to find a general solution going through an intermediate boolean array, but I'm sure there is something much easier.
mask2 = np.ones(a.shape, dtype=bool)
mask2[:, mask] = False
data = a[mask2].reshape(a.shape[0], a.shape[1] - len(mask))
Thank you

Have a look at numpy.invert, numpy.bitwise_not, numpy.logical_not, or more concisely ~mask. (They all do the same thing, in this case.)
As a quick example:
import numpy as np
x = np.arange(10)
mask = x > 5
print(x[mask])
print(x[~mask])
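Note that ~ only inverts a boolean mask. For a list of integer column indices like the one in the question, two possible sketches are np.delete, or a 1-D boolean selector over the columns. The names keep and data2 are mine.

```python
import numpy as np

a = np.arange(20).reshape(4, 5)
mask = [0, 2]  # column indices to reject

# Option 1: np.delete returns a copy without the listed columns.
data = np.delete(a, mask, axis=1)

# Option 2: build a 1-D boolean selector over the columns and switch off the listed ones.
keep = np.ones(a.shape[1], dtype=bool)
keep[mask] = False
data2 = a[:, keep]
```

Both give the columns 1, 3 and 4 of a, with no reshape needed afterwards.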

How to get the index of a maximum element in a NumPy array along one axis

I have a 2 dimensional NumPy array. I know how to get the maximum values over axes:
>>> a = array([[1,2,3],[4,3,1]])
>>> amax(a,axis=0)
array([4, 3, 3])
How can I get the indices of the maximum elements? I would like as output array([1,1,0]) instead.

>>> a.argmax(axis=0)
array([1, 1, 0])
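If you also want to get the maximum values back from those indices, one way (a small sketch, assuming a NumPy version that provides np.take_along_axis) is:

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 3, 1]])
idx = a.argmax(axis=0)  # row index of the maximum in each column

# take_along_axis needs an index array with the same number of dimensions as a.
maxima = np.take_along_axis(a, idx[np.newaxis, :], axis=0)[0]
```

Here maxima matches what a.max(axis=0) returns.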

>>> import numpy as np
>>> a = np.array([[1,2,3],[4,3,1]])
>>> i,j = np.unravel_index(a.argmax(), a.shape)
>>> a[i,j]
4

argmax() will only return the first occurrence for each row.
http://docs.scipy.org/doc/numpy/reference/generated/numpy.argmax.html
If you ever need to do this for a multi-dimensional array and want every occurrence of the maximum, np.where works better than unravel_index:
import numpy as np
a = np.array([[1,2,3], [4,3,1]]) # Can be of any shape
indices = np.where(a == a.max())
You can also change your conditions:
indices = np.where(a >= 1.5)
The above gives you results in the form that you asked for. Alternatively, you can convert to a list of x,y coordinates by:
x_y_coords = list(zip(indices[0], indices[1]))  # list() is needed in Python 3, where zip returns an iterator
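A possible shortcut for the coordinate form, sketched here with the same example array, is np.argwhere, which returns the matching positions as rows of an array:

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 3, 1]])

# One row per matching element, one column per array dimension.
coords = np.argwhere(a == a.max())
```

For this array the only maximum is at row 1, column 0.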

NumPy provides argmin() and argmax(), which return the index of the minimum and maximum of an array respectively.
For a 1-D array, for example, you would do something like this:
import numpy as np
a = np.array([50,1,0,2])
print(a.argmax()) # prints 0
print(a.argmin()) # prints 2
And similarly for a multi-dimensional array, where the result is an index into the flattened array:
import numpy as np
a = np.array([[0,2,3],[4,30,1]])
print(a.argmax()) # prints 4
print(a.argmin()) # prints 0
Note that these will only return the index of the first occurrence.

v = alli.max()
index = alli.argmax()
x, y = index // 8, index % 8  # integer division; assumes alli has 8 columns
