Numpy.array indexing - python

import numpy as np
arr = np.array([[0, 1, 0],
                [1, 0, 0],
                [1, 0, 0]])
mask = arr
print('boolean mask is:')
print(mask)
print('arr[mask] is:')
print(arr[mask])
Result:
boolean mask is:
[[0 1 0]
[1 0 0]
[1 0 0]]
arr[mask] is:
[[[0 1 0]
[1 0 0]
[0 1 0]]
[[1 0 0]
[0 1 0]
[0 1 0]]
[[1 0 0]
[0 1 0]
[0 1 0]]]
I know how indexing works when the mask is 2-D, but I'm confused when the mask is 3-D.
Can anyone explain it?

import numpy as np
l = [[0,1,2],[3,5,4],[7,8,9]]
arr = np.array(l)
mask = arr[:,:] > 5
print(mask) # shows boolean results
print(mask.sum()) # shows how many items are > 5
print(arr[:,1]) # slicing
print(arr[:,2]) # slicing
print(arr[:, 0:3]) # slicing
Output:
[[False False False]
[False False False]
[ True True True]]
3
[1 5 8]
[2 4 9]
[[0 1 2]
[3 5 4]
[7 8 9]]
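As a rough sketch of why the question's arr[mask] comes out 3-D: there, mask is an integer array, not a boolean one, so NumPy applies integer (fancy) indexing and treats every entry of mask as a row number of arr. The names int_result, bool_mask and bool_result below are mine, purely for illustration.

```python
import numpy as np

arr = np.array([[0, 1, 0],
                [1, 0, 0],
                [1, 0, 0]])

# Integer array as index: each entry mask[i, j] picks the row arr[mask[i, j]],
# so the result has shape mask.shape + arr.shape[1:] = (3, 3, 3).
int_result = arr[arr]
print(int_result.shape)

# Converting to a real boolean mask selects single elements instead,
# returning a 1-D array of the values where the mask is True.
bool_mask = arr.astype(bool)
bool_result = arr[bool_mask]
print(bool_result)
```

So if a boolean mask is what you meant, build it with a comparison (arr > 0) or with astype(bool) before indexing.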

Related

What is the best way to index 3d matrix with vectors?

import jax.numpy as jnp
The vectors and the array are all jnp.array with dtype=jnp.int32.
I have an array with shape [x, d, y] (3x3x3)
[[[0 0 0],
[0 0 0],
[0 0 0]],
[[0 0 0],
[0 0 0],
[0 0 0]],
[[0 0 0],
[0 0 0],
[0 0 0]]]
and vectors x = [2 0 3], y = [2 0 1], d = [0 0 1]
I want to get something like this by indexing, but I haven't been able to work out how to do it with jax.numpy.
[[[0 0 2],
[0 0 0],
[0 0 0]],
[[0 0 0],
[0 0 0],
[0 0 0]],
[[0 0 0],
[0 3 0],
[0 0 0]]]
Edit: To clarify, I want to write each value of x into the array at its corresponding index, but only where x > 0. I tried using a boolean mask.
Something like this:
mask = x > 0
array = array.at[mask, d, y].set(array[mask, d, y] + x)
You have a three-dimensional array, so you can index it with three arrays of indices. Since you want d and y to be associated with the second and third dimensions, you'll need to create another array of indices for the first dimension:
import jax.numpy as jnp
arr = jnp.zeros((3, 3, 3), dtype='int32')
x = jnp.array([2, 0, 3])
y = jnp.array([2, 0, 1])
d = jnp.array([0, 0, 1])
i = jnp.arange(len(x))
mask = x > 0
out = arr.at[i[mask], d[mask], y[mask]].set(x[mask])
print(out)
# [[[0 0 2]
# [0 0 0]
# [0 0 0]]
# [[0 0 0]
# [0 0 0]
# [0 0 0]]
# [[0 0 0]
# [0 3 0]
# [0 0 0]]]
In this case the result will be the same whether or not you use the mask (i.e. arr.at[i, d, y].set(x) will give the same result) but because your question explicitly specified that you only want to use values x > 0 I included it.
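For what it's worth, here is a small sketch of why the attempt in the question (a boolean mask mixed with the unmasked d and y) does not line up. It uses plain NumPy rather than JAX, on the assumption that the index-broadcasting rule is the same; the exact error JAX raises may differ. The flag mixed_ok is just an illustrative name.

```python
import numpy as np

arr = np.zeros((3, 3, 3), dtype=np.int32)
x = np.array([2, 0, 3])
y = np.array([2, 0, 1])
d = np.array([0, 0, 1])
mask = x > 0

# A boolean index behaves like its nonzero positions ([0, 2] here), which has
# length 2 and cannot broadcast against the length-3 index arrays d and y.
try:
    arr[mask, d, y]
    mixed_ok = True
except IndexError:
    mixed_ok = False
print(mixed_ok)

# Filtering every index array with the same mask keeps the lengths aligned.
i = np.arange(len(x))
out = arr.copy()
out[i[mask], d[mask], y[mask]] = x[mask]
print(out[0, 0, 2], out[2, 1, 1])
```

This is why the answer masks i, d, y and x together instead of passing mask as the first index.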

How to check some elements of a matrix and generate a set of matrices accordingly?

I'm trying to check elements of a 2D array (matrix) and generate a number of matrices (of equal size) depending on some conditions as below:
Consider my matrix:
x = [[1, 0, 2],[7, 0, 7],[1, 1, 1]]
I need to check for the 2 and generate two matrices in which the position of the 2 is replaced by 0 and 1 respectively. I also need to check for the 7s and generate 3 combinations of the matrix, with the pair of 7s taking the values (0,1), (1,0) and (1,1) respectively. This means the total number of matrices generated is 6, as follows:
[[1, 0, 0],[0, 0, 1],[1, 1, 1]]
[[1, 0, 0],[1, 0, 0],[1, 1, 1]]
[[1, 0, 0],[1, 0, 1],[1, 1, 1]]
[[1, 0, 1],[0, 0, 1],[1, 1, 1]]
[[1, 0, 1],[1, 0, 0],[1, 1, 1]]
[[1, 0, 1],[1, 0, 1],[1, 1, 1]]
There can be more than one 2, and the 7s can be arranged vertically or horizontally.
I've tried a naive approach, just looping through x looking for 2s and appending:
for i in range(len(x)):
    for j in range(len(x[0])):
        if x[i][j] == 2:
            inter[i][j] = 0
            test.append(inter)
            inter2[i][j] = 1
            test.append(inter2)
But that only works if the matrix contains 2s only. I've also tried converting to a numpy array and using where() to find the indexes of the 2s and 7s, but I don't know how to use those to generate the required outcome. Any thoughts?
The conditions described are very vague. If I understand correctly, you want this:
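import numpy as np
x = np.array([[1, 0, 2],[7, 0, 7],[1, 1, 1]])  # must be an ndarray for mask assignment
sevens = [[0,1],[1,0],[1,1]]  # allowed value pairs for the two 7s
twos = [0,1]  # allowed values for the 2
for i in twos:
    for j in sevens:
        m = x.copy()
        m[m==2] = i  # replace the 2
        m[m==7] = j  # replace the two 7s, in row-major order
        print(m)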
output:
[[1 0 0]
[0 0 1]
[1 1 1]]
[[1 0 0]
[1 0 0]
[1 1 1]]
[[1 0 0]
[1 0 1]
[1 1 1]]
[[1 0 1]
[0 0 1]
[1 1 1]]
[[1 0 1]
[1 0 0]
[1 1 1]]
[[1 0 1]
[1 0 1]
[1 1 1]]
UPDATE: per OP's comment, to handle multiple 2s:
x = np.array([[2, 0, 2],[7, 0, 7],[1, 1, 1]])
sevens = [[0,1],[1,0],[1,1]]
n = int((x==2).sum())  # number of 2s in the matrix
v = n*([0,1],)
# every 0/1 combination for the n positions (n replaces the hardcoded 2)
twos = np.array(np.meshgrid(*v)).T.reshape(-1,n)
for i in twos:
    for j in sevens:
        m = x.copy()
        m[m==2] = i
        m[m==7] = j
        print(m)
output:
[[0 0 0]
[0 0 1]
[1 1 1]]
[[0 0 0]
[1 0 0]
[1 1 1]]
[[0 0 0]
[1 0 1]
[1 1 1]]
[[0 0 1]
[0 0 1]
[1 1 1]]
[[0 0 1]
[1 0 0]
[1 1 1]]
[[0 0 1]
[1 0 1]
[1 1 1]]
[[1 0 0]
[0 0 1]
[1 1 1]]
[[1 0 0]
[1 0 0]
[1 1 1]]
[[1 0 0]
[1 0 1]
[1 1 1]]
[[1 0 1]
[0 0 1]
[1 1 1]]
[[1 0 1]
[1 0 0]
[1 1 1]]
[[1 0 1]
[1 0 1]
[1 1 1]]
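An alternative sketch of the same idea, using itertools.product instead of meshgrid to enumerate the 0/1 choices for however many 2s there are. It assumes, like the answer above, that the matrix contains exactly one pair of 7s; the names two_mask, seven_mask and results are only illustrative.

```python
import itertools
import numpy as np

x = np.array([[2, 0, 2], [7, 0, 7], [1, 1, 1]])
two_mask = x == 2
seven_mask = x == 7
sevens = [(0, 1), (1, 0), (1, 1)]  # allowed value pairs for the two 7s

results = []
# One tuple of 0/1 values per combination, one value for each 2 in x.
for two_vals in itertools.product([0, 1], repeat=int(two_mask.sum())):
    for seven_vals in sevens:
        m = x.copy()
        m[two_mask] = two_vals      # filled in row-major order
        m[seven_mask] = seven_vals
        results.append(m)

print(len(results))
```

Collecting the matrices in a list (rather than printing them) makes it easier to reuse them afterwards.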

Add field to memmapped numpy record array

With normal memmapped numpy arrays, you can "add" a new column by opening the memmap file with an additional column in the shape.
k = np.memmap('input', dtype='int32', shape=(10, 2), mode='r+', order='F')
k[:] = 1
l = np.memmap('input', dtype='int32', shape=(10, 3), mode='r+', order='F')
print(k)
print(l)
[[1 1]
[1 1]
[1 1]
[1 1]
[1 1]
[1 1]
[1 1]
[1 1]
[1 1]
[1 1]]
[[1 1 0]
[1 1 0]
[1 1 0]
[1 1 0]
[1 1 0]
[1 1 0]
[1 1 0]
[1 1 0]
[1 1 0]
[1 1 0]]
Is it possible to do something similar with record arrays? It seems possible for rows, but I can't find a way to add a new field when the dtype has heterogeneous types.

Is there any way to run np.where on multiple values rather than just one?

I'm wondering if I have an image in a numpy array, say 250x250x3 (3 channels), is it possible to use np.where to quickly find out if any of the 250x250 arrays of size 3 are equal to [143, 255, 0] or another color represented by rgb and get a 250x250 bool array?
When I try it in code with a 4x4x3, I get a 3x3 array as a result and I'm not totally sure where that shape is coming from.
import numpy as np
test = np.arange(4,52).reshape(4,4,3)
print(np.where(test == [4,5,6]))
-------------------------------------------
Result:
array([[0, 0, 0],
[0, 0, 0],
[0, 1, 2]])
What I'm trying to get:
array([[1, 0, 0, 0],
[0, 0, 0, 0],
[0, 0, 0, 0],
[0, 0, 0, 0]])
Solution
You don't need np.where (or anything particularly complicated) at all. You can just make use of the power of boolean arrays:
print(np.all(test == [4,5,6], axis=-1).astype(int))
# output:
# [[1 0 0 0]
# [0 0 0 0]
# [0 0 0 0]
# [0 0 0 0]]
An equivalent alternative would be to use logical_and:
print(np.logical_and.reduce(test == [4,5,6], axis=-1).astype(int))
# output:
# [[1 0 0 0]
# [0 0 0 0]
# [0 0 0 0]
# [0 0 0 0]]
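As for where the unexpected result shape in the question comes from, here is a small sketch (the names eq, idx and pixel_mask are mine): the comparison broadcasts to a (4, 4, 3) boolean array, and np.where on a 3-D array returns one index array per axis, with one entry per True element rather than per pixel.

```python
import numpy as np

test = np.arange(4, 52).reshape(4, 4, 3)

eq = test == [4, 5, 6]       # broadcast over the last axis, shape (4, 4, 3)
idx = np.where(eq)           # tuple of 3 index arrays, one per axis
print(eq.shape, len(idx), idx[0].shape)

# Collapsing the channel axis first gives one boolean per pixel.
pixel_mask = eq.all(axis=-1)  # shape (4, 4)
rows, cols = np.where(pixel_mask)
print(pixel_mask.shape, rows, cols)
```

Here three elements match (all three channels of pixel (0, 0)), which is why the question saw three index arrays of length three.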
Heavy duty test
import numpy as np
np.random.seed(0)
# the subarray we'll search for
pattern = [143, 255, 0]
# generate a random test array
arr = np.random.randint(0, 255, size=(255,255,3))
# insert the pattern array at ~10000 random indices
ix = np.unique(np.random.randint(np.prod(arr.shape[:-1]), size=10000))
arr.reshape(-1, arr.shape[-1])[ix] = pattern
# find all instances of the pattern array (ignore partial matches)
loc = np.all(arr==pattern, axis=-1).astype(int)
# test that the found locs are equivalent to the test ixs
locix = np.ravel_multi_index(loc.nonzero(), arr.shape[:-1])
np.testing.assert_array_equal(np.sort(ix), np.sort(locix))
# test has been run, the above assert passes
For simplicity, let's say that we are looking for all locations where all 3 channels equal 1. This will do the job:
np.random.seed(0)
a=np.random.randint(0,2,(3,5,5))
print(a)
np.where((a[0]==1)*(a[1]==1)*(a[2]==1))
This outputs
[[[0 1 1 0 1]
[1 1 1 1 1]
[1 0 0 1 0]
[0 0 0 0 1]
[0 1 1 0 0]]
[[1 1 1 1 0]
[1 0 1 0 1]
[1 0 1 1 0]
[0 1 0 1 1]
[1 1 1 0 1]]
[[0 1 1 1 1]
[0 1 0 0 1]
[1 0 1 0 1]
[0 0 0 0 0]
[1 1 0 0 0]]]
(array([0, 0, 1, 2, 4], dtype=int64), array([1, 2, 4, 0, 1], dtype=int64))
And indeed there are 5 coordinates in which all 3 channels equal 1.
If you want an easier-to-read representation, replace the last line with
tuple(zip(*np.where((a[0]==1)*(a[1]==1)*(a[2]==1))))
This will output
((0, 1), (0, 2), (1, 4), (2, 0), (4, 1))
which are all the 5 locations where all 3 channels equal 1.
Note that (a[0]==1)*(a[1]==1)*(a[2]==1) is just
array([[False, True, True, False, False],
[False, False, False, False, True],
[ True, False, False, False, False],
[False, False, False, False, False],
[False, True, False, False, False]])
the boolean representation that you were looking for.
If you want to get any other triplet, say [143, 255, 0], just use (a[0]==143)*(a[1]==255)*(a[2]==0).
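Note that the example above stores the channels on the first axis, shape (3, 5, 5). For an image laid out as height x width x 3, as in the question, a sketch of the same product trick would index the last axis instead. The array img and the pixel I plant in it are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(4, 4, 3))  # hypothetical channels-last image
img[1, 2] = [143, 255, 0]                   # plant one matching pixel

# Per-channel comparison on the last axis, combined with elementwise product.
mask_product = (img[..., 0] == 143) * (img[..., 1] == 255) * (img[..., 2] == 0)

# Same mask via broadcasting and np.all over the channel axis.
mask_all = np.all(img == [143, 255, 0], axis=-1)

print(np.array_equal(mask_product, mask_all))
print(mask_all[1, 2])
```

Both forms give a height x width boolean array; the np.all version just avoids writing out each channel by hand.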

python: Find subsets coordinate is in

I have a set of coordinates and am trying to find the subsets that contain a given coordinate.
import numpy as np
a=np.array([[[0,1,1],[1,1,1]],[[0,1,1],[2,1,1]],[[3,3,3],[2,2,2]]])
If I try things like:
print(np.argwhere(a==[[0,1,1]]))
print(a[[0,1,1]])
print(np.isin([0,1,1],a))
I get:
[[0 0 0]
[0 0 1]
[0 0 2]
[0 1 1]
[0 1 2]
[1 0 0]
[1 0 1]
[1 0 2]
[1 1 1]
[1 1 2]]
[[[0 1 1]
[1 1 1]]
[[0 1 1]
[2 1 1]]
[[0 1 1]
[2 1 1]]]
[ True True True]
But I expect something like:
[True, True, False]
EDIT
Ideally, I would get an array containing only the other coordinates that belong to the subsets found, like:
out = [[1,1,1],[2,1,1]]
Use all(-1) to require equality along the last axis, then any(1) to check whether any coordinate in each subset (the second axis) satisfies that condition:
(a == [0,1,1]).all(-1).any(1)
# array([ True, True, False], dtype=bool)
On the update:
mask = (a == [0,1,1]).all(-1)
a[mask.any(1)[:,None] & ~mask]
#array([[1, 1, 1],
# [2, 1, 1]])
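I got the results you're looking for by doing this:
[[0,1,1] in b for b in a]
Be careful, though: for a NumPy array b, the in operator evaluates (b == [0,1,1]).any(), so it is True as soon as a single element matches, not only when a whole row does. It happens to give the right answer for this data.
As for why isin didn't work: np.isin tests each element of [0,1,1] separately against all the values in a, and 0 and 1 both occur somewhere in a.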
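A small sketch of what np.isin and the in operator actually compute here, compared with the row-wise test. The array b is a made-up counterexample, not data from the question.

```python
import numpy as np

a = np.array([[[0, 1, 1], [1, 1, 1]],
              [[0, 1, 1], [2, 1, 1]],
              [[3, 3, 3], [2, 2, 2]]])
target = [0, 1, 1]

# np.isin checks each element of target on its own against every value in a.
isin_result = np.isin(target, a)
print(isin_result)

# Hypothetical subset with no [0, 1, 1] row, but with a 0 in the first column.
b = np.array([[0, 5, 5], [9, 9, 9]])
loose = target in b                       # same as (b == target).any()
strict = (b == target).all(-1).any()      # True only if a whole row matches
print(loose, strict)
```

The list comprehension is therefore only safe when no subset can match the target in just some positions; the all(-1).any(1) form does not have that limitation.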
