Fancy indexing across multiple dimensions in numpy [duplicate] - python

I've got a strange situation.
I have a 2D Numpy array, x:
x = np.random.randint(0, 6, (20,8))
And I have two indexers, one with indices for the rows and one with indices for the columns. To index x, I have to do the following:
row_indices = [4,2,18,16,7,19,4]
col_indices = [1,2]
x_rows = x[row_indices,:]
x_indexed = x_rows[:,col_indices]
Instead of just:
x_new = x[row_indices,col_indices]
(which fails with a shape-mismatch error: the index arrays of shapes (7,) and (2,) cannot be broadcast together)
I'd like to do the indexing in one line using broadcasting, since that would keep the code clean and readable. Also, I don't know much about what Python does under the hood, but as I understand it, doing it in one line should be faster (and I'll be working with pretty big arrays).
Test Case:
x = np.random.randint(0, 6, (20,8))
row_indices = [4,2,18,16,7,19,4]
col_indices = [1,2]
x_rows = x[row_indices,:]
x_indexed = x_rows[:,col_indices]
x_doesnt_work = x[row_indices,col_indices]

Selections or assignments with np.ix_ using indexing or boolean arrays/masks
1. With indexing-arrays
A. Selection
We can use np.ix_ to get a tuple of indexing arrays that are broadcastable against each other, so that together they form every combination of the given row and column indices. Indexing the input array with that tuple then returns the corresponding 2D block. Hence, to make a selection based on two 1D indexing arrays, it would be -
x_indexed = x[np.ix_(row_indices,col_indices)]
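As a minimal sketch of the selection above, the following compares the one-step np.ix_ form with the two-step indexing from the question. The seeded generator `rng` and the names `one_step` and `two_step` are assumptions introduced only for this illustration.

```python
import numpy as np

rng = np.random.default_rng(0)          # seeded only to make the sketch repeatable
x = rng.integers(0, 6, size=(20, 8))    # values 0..5, same shape as in the question

row_indices = [4, 2, 18, 16, 7, 19, 4]
col_indices = [1, 2]

# Two-step version from the question
two_step = x[row_indices, :][:, col_indices]

# One-step version: np.ix_ builds index arrays of shape (7, 1) and (1, 2)
one_step = x[np.ix_(row_indices, col_indices)]

print(one_step.shape)                       # (7, 2)
print(np.array_equal(one_step, two_step))   # True
```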
B. Assignment
We can use the same notation for assigning scalar or a broadcastable array into those indexed positions. Hence, the following works for assignments -
x[np.ix_(row_indices,col_indices)] = value  # value: a scalar or a broadcastable array
2. With masks
We can also use boolean arrays/masks with np.ix_, similar to how indexing arrays are used. This can be used again to select a block off the input array and also for assignments into it.
A. Selection
Thus, with row_mask and col_mask boolean arrays as the masks for row and column selections respectively, we can use the following for selections -
x[np.ix_(row_mask,col_mask)]
B. Assignment
And the following works for assignments -
x[np.ix_(row_mask,col_mask)] = value  # value: a scalar or a broadcastable array
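To see why masks work here, it may help to look at what np.ix_ returns for boolean input: each mask is replaced by the integer positions of its True entries. The sketch below uses a small `np.arange` array as a stand-in for x, and the names `block_from_masks` and `block_from_ints` are made up for the comparison.

```python
import numpy as np

x = np.arange(56).reshape(7, 8)
row_mask = np.array([0, 1, 1, 0, 0, 1, 0], dtype=bool)
col_mask = np.array([1, 0, 1, 0, 1, 1, 0, 0], dtype=bool)

# np.ix_ turns each boolean mask into the integer positions of its True entries
rows, cols = np.ix_(row_mask, col_mask)
print(rows.ravel())   # [1 2 5]
print(cols.ravel())   # [0 2 4 5]

# So the mask form and the explicit integer form select the same block
block_from_masks = x[np.ix_(row_mask, col_mask)]
block_from_ints = x[np.ix_(np.flatnonzero(row_mask), np.flatnonzero(col_mask))]
print(np.array_equal(block_from_masks, block_from_ints))   # True
```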
Sample Runs
1. Using np.ix_ with indexing-arrays
Input array and indexing arrays -
In [221]: x
Out[221]:
array([[17, 39, 88, 14, 73, 58, 17, 78],
[88, 92, 46, 67, 44, 81, 17, 67],
[31, 70, 47, 90, 52, 15, 24, 22],
[19, 59, 98, 19, 52, 95, 88, 65],
[85, 76, 56, 72, 43, 79, 53, 37],
[74, 46, 95, 27, 81, 97, 93, 69],
[49, 46, 12, 83, 15, 63, 20, 79]])
In [222]: row_indices
Out[222]: [4, 2, 5, 4, 1]
In [223]: col_indices
Out[223]: [1, 2]
Tuple of indexing arrays with np.ix_ -
In [224]: np.ix_(row_indices,col_indices) # Broadcasting of indices
Out[224]:
(array([[4],
[2],
[5],
[4],
[1]]), array([[1, 2]]))
Make selections -
In [225]: x[np.ix_(row_indices,col_indices)]
Out[225]:
array([[76, 56],
[70, 47],
[46, 95],
[76, 56],
[92, 46]])
As suggested by the OP, this is in effect the same as plain broadcasting with a 2D version of row_indices: its elements are placed along axis=0, which creates a singleton dimension at axis=1 and so allows broadcasting against col_indices. Thus, we would have an alternative solution like so -
In [227]: x[np.asarray(row_indices)[:,None],col_indices]
Out[227]:
array([[76, 56],
[70, 47],
[46, 95],
[76, 56],
[92, 46]])
As discussed earlier, assignments use the same indexing expression on the left-hand side.
Row, col indexing arrays -
In [36]: row_indices = [1, 4]
In [37]: col_indices = [1, 3]
Make assignments with scalar -
In [38]: x[np.ix_(row_indices,col_indices)] = -1
In [39]: x
Out[39]:
array([[17, 39, 88, 14, 73, 58, 17, 78],
[88, -1, 46, -1, 44, 81, 17, 67],
[31, 70, 47, 90, 52, 15, 24, 22],
[19, 59, 98, 19, 52, 95, 88, 65],
[85, -1, 56, -1, 43, 79, 53, 37],
[74, 46, 95, 27, 81, 97, 93, 69],
[49, 46, 12, 83, 15, 63, 20, 79]])
Make assignments with 2D block(broadcastable array) -
In [40]: rand_arr = -np.arange(4).reshape(2,2)
In [41]: x[np.ix_(row_indices,col_indices)] = rand_arr
In [42]: x
Out[42]:
array([[17, 39, 88, 14, 73, 58, 17, 78],
[88, 0, 46, -1, 44, 81, 17, 67],
[31, 70, 47, 90, 52, 15, 24, 22],
[19, 59, 98, 19, 52, 95, 88, 65],
[85, -2, 56, -3, 43, 79, 53, 37],
[74, 46, 95, 27, 81, 97, 93, 69],
[49, 46, 12, 83, 15, 63, 20, 79]])
2. Using np.ix_ with masks
Input array -
In [19]: x
Out[19]:
array([[17, 39, 88, 14, 73, 58, 17, 78],
[88, 92, 46, 67, 44, 81, 17, 67],
[31, 70, 47, 90, 52, 15, 24, 22],
[19, 59, 98, 19, 52, 95, 88, 65],
[85, 76, 56, 72, 43, 79, 53, 37],
[74, 46, 95, 27, 81, 97, 93, 69],
[49, 46, 12, 83, 15, 63, 20, 79]])
Input row, col masks -
In [20]: row_mask = np.array([0,1,1,0,0,1,0],dtype=bool)
In [21]: col_mask = np.array([1,0,1,0,1,1,0,0],dtype=bool)
Make selections -
In [22]: x[np.ix_(row_mask,col_mask)]
Out[22]:
array([[88, 46, 44, 81],
[31, 47, 52, 15],
[74, 95, 81, 97]])
Make assignments with scalar -
In [23]: x[np.ix_(row_mask,col_mask)] = -1
In [24]: x
Out[24]:
array([[17, 39, 88, 14, 73, 58, 17, 78],
[-1, 92, -1, 67, -1, -1, 17, 67],
[-1, 70, -1, 90, -1, -1, 24, 22],
[19, 59, 98, 19, 52, 95, 88, 65],
[85, 76, 56, 72, 43, 79, 53, 37],
[-1, 46, -1, 27, -1, -1, 93, 69],
[49, 46, 12, 83, 15, 63, 20, 79]])
Make assignments with 2D block(broadcastable array) -
In [25]: rand_arr = -np.arange(12).reshape(3,4)
In [26]: x[np.ix_(row_mask,col_mask)] = rand_arr
In [27]: x
Out[27]:
array([[ 17, 39, 88, 14, 73, 58, 17, 78],
[ 0, 92, -1, 67, -2, -3, 17, 67],
[ -4, 70, -5, 90, -6, -7, 24, 22],
[ 19, 59, 98, 19, 52, 95, 88, 65],
[ 85, 76, 56, 72, 43, 79, 53, 37],
[ -8, 46, -9, 27, -10, -11, 93, 69],
[ 49, 46, 12, 83, 15, 63, 20, 79]])

What about:
x[row_indices][:,col_indices]
For example,
x = np.random.randint(0, 6, (5,5))
## array([[4, 3, 2, 5, 0],
## [0, 3, 1, 4, 2],
## [4, 2, 0, 0, 3],
## [4, 5, 5, 5, 0],
## [1, 1, 5, 0, 2]])
row_indices = [4,2]
col_indices = [1,2]
x[row_indices][:,col_indices]
## array([[1, 5],
## [2, 0]])
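One caveat worth illustrating: the chained form is fine for reading, but it cannot be used for assignment, because x[row_indices] returns a copy. A small sketch, where the zero-filled x and the names `sum_after_chained` and `sum_after_ix` are assumptions for the demonstration:

```python
import numpy as np

x = np.zeros((5, 5), dtype=int)
row_indices = [4, 2]
col_indices = [1, 2]

# Chained indexing: x[row_indices] is a copy, so the assignment lands in a temporary array
x[row_indices][:, col_indices] = 1
sum_after_chained = int(x.sum())
print(sum_after_chained)   # 0, x is unchanged

# Single indexing operation: the assignment reaches x itself
x[np.ix_(row_indices, col_indices)] = 1
sum_after_ix = int(x.sum())
print(sum_after_ix)        # 4
```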

import numpy as np
x = np.random.randint(0, 6, (4,4))
x
array([[5, 3, 3, 2],
[4, 3, 0, 0],
[1, 4, 5, 3],
[0, 4, 3, 4]])
# This indexes the elements 1,1 and 2,2 and 3,3
indexes = (np.array([1,2,3]),np.array([1,2,3]))
x[indexes]
# returns array([3, 5, 4])
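Note that NumPy applies very different rules depending on the kind of index you use. A tuple of integer arrays is matched element by element, so it selects individual entries rather than a block (see the indexing manual).
Converting your lists to np.ndarray therefore works only when the index arrays have the same or broadcastable shapes; to get the full row/column block from the question, one of them needs an extra axis.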
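The difference between element-wise pairing and block selection can be sketched as follows. The `np.arange` array is used here only so the result is predictable, and the names `pairwise` and `block` are illustrative.

```python
import numpy as np

x = np.arange(16).reshape(4, 4)
rows = np.array([1, 2, 3])
cols = np.array([1, 2, 3])

pairwise = x[rows, cols]         # elements (1,1), (2,2), (3,3)
block = x[np.ix_(rows, cols)]    # every row in rows crossed with every column in cols

print(pairwise)      # [ 5 10 15]
print(block.shape)   # (3, 3)
```

The pairwise result is the diagonal of the block, which is why plain array indices look correct only when a diagonal is what was wanted.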

I think you are trying to do one of the following (equivalent) operations:
x_does_work = x[row_indices,:][:,col_indices]
x_does_work = x[:,col_indices][row_indices,:]
This will actually create a subset of x with only the selected rows, then select the columns from that, or vice versa in the second case. The first case can be thought of as
x_does_work = (x[row_indices,:])[:,col_indices]

Your first try would work if you add an axis with np.newaxis (row_indices has to be an array for this, not a list):
x_new = x[np.asarray(row_indices)[:, np.newaxis], col_indices]

Related

How do I neatly select a numpy sub-array in 1 line? [duplicate]


Numpy: using entries of array to populate larger array with repeats [duplicate]


NumPy nearest value along axis of multidimensional array

I'd like to create a function that returns the value in an array nearest to a given value along a specified axis.
To get the index of the nearest value I use the following code where arr is a multidimensional array and value is the value to look for:
def nearest_index( arr, value, axis=None ):
    return ( np.abs( arr - value ) ).argmin( axis=axis )
But I struggle with using the result of this function to get the values from the array.
It is easy with 1D-arrays:
In [14]: arr_1 = np.random.randint( 10, 100, size=( 10, ) )
In [15]: arr_1
Out[15]: array([67, 49, 90, 29, 60, 80, 31, 55, 29, 10])
In [16]: nearest_index( arr_1, 50 )
Out[16]: 1
In [17]: arr_1[nearest_index( arr_1, 50 )]
Out[17]: 49
or with flattened arrays:
In [25]: arr_3 = np.random.randint( 10, 100, size=( 2, 3, 4, ) )
In [26]: arr_3
Out[26]:
array([[[85, 51, 74, 79],
[63, 42, 27, 75],
[89, 68, 80, 63]],
[[85, 72, 74, 16],
[85, 22, 47, 78],
[44, 70, 98, 34]]])
In [27]: idx_flat = nearest_index( arr_3, 50, axis=None )
In [28]: idx_flat
Out[28]: 1
In [29]: idx = np.unravel_index( idx_flat, arr_3.shape )
In [30]: idx
Out[30]: (0, 0, 1)
In [31]: arr_3[idx]
Out[31]: 51
How can I create a function that returns the values along the specified axis?
I tried the solution for this Question, but I only got it working for axis=-1.
Note that it is not an issue for me if only the first occurrence is found when multiple elements in the array are equally near the expected value.
For a multi-dimensional array, we need to use advanced-indexing. So, for a generic n-dim array and with a specified axis, we could do something like this -
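def argmin_values_along_axis(arr, value, axis):
    argmin_idx = np.abs(arr - value).argmin(axis=axis)
    shp = arr.shape
    indx = list(np.ix_(*[np.arange(i) for i in shp]))
    indx[axis] = np.expand_dims(argmin_idx, axis=axis)
    # index with a tuple (recent NumPy does not accept a list of arrays as a multi-axis index)
    # and squeeze only the reduced axis, so other length-1 axes are kept
    return np.squeeze(arr[tuple(indx)], axis=axis)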
Sample runs -
In [203]: arr_3 = np.random.randint( 10, 100, size=( 2, 3, 4, ) )
In [204]: arr_3
Out[204]:
array([[[94, 55, 26, 51],
[82, 66, 80, 66],
[96, 54, 93, 57]],
[[59, 28, 95, 56],
[47, 48, 17, 77],
[15, 57, 57, 25]]])
In [205]: argmin_values_along_axis(arr_3, value=50, axis=0)
Out[205]:
array([[59, 55, 26, 51],
[47, 48, 80, 66],
[15, 54, 57, 57]])
In [206]: argmin_values_along_axis(arr_3, value=50, axis=1)
Out[206]:
array([[82, 54, 26, 51],
[47, 48, 57, 56]])
In [207]: argmin_values_along_axis(arr_3, value=50, axis=2)
Out[207]:
array([[51, 66, 54],
[56, 48, 57]])
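As an alternative sketch, assuming NumPy 1.15 or later, np.take_along_axis can do the gather step without building the full set of np.ix_ index arrays. The function name `nearest_values_along_axis` is made up for this example; the input reuses the arr_3 values from the sample run above.

```python
import numpy as np

def nearest_values_along_axis(arr, value, axis):
    """Return the entries of arr closest to value along the given axis."""
    idx = np.abs(arr - value).argmin(axis=axis)
    # take_along_axis needs an index array with the same ndim as arr
    idx = np.expand_dims(idx, axis=axis)
    return np.take_along_axis(arr, idx, axis=axis).squeeze(axis=axis)

arr_3 = np.array([[[94, 55, 26, 51],
                   [82, 66, 80, 66],
                   [96, 54, 93, 57]],
                  [[59, 28, 95, 56],
                   [47, 48, 17, 77],
                   [15, 57, 57, 25]]])

print(nearest_values_along_axis(arr_3, 50, axis=2))
# [[51 66 54]
#  [56 48 57]]
```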
Well it works for me.
def nearest_index(arr, value, axis=None):
    return np.argmin(np.abs( arr - value ), axis=axis)
>>> X
array([[76, 94, 56, 93, 28, 0, 44, 50, 89, 93],
[80, 99, 29, 98, 39, 27, 55, 70, 19, 76],
[87, 7, 28, 78, 47, 95, 34, 97, 66, 27],
[75, 78, 82, 30, 15, 0, 2, 25, 58, 69],
[31, 2, 34, 1, 56, 7, 87, 78, 32, 77],
[89, 80, 76, 97, 49, 18, 62, 35, 94, 41],
[ 2, 44, 83, 3, 64, 4, 49, 93, 46, 8],
[51, 63, 45, 57, 77, 90, 93, 4, 26, 81],
[43, 92, 22, 98, 93, 36, 46, 25, 35, 36],
[30, 14, 42, 91, 86, 14, 78, 9, 37, 19]])
>>> X[nearest_index(X, 2, axis=0), np.arange(10)]
array([ 2, 2, 22, 1, 15, 0, 2, 4, 19, 8])
>>> X[np.arange(10), nearest_index(X, 2, axis=1)]
array([ 0, 19, 7, 2, 2, 18, 2, 4, 22, 9])

Slicing numpy array with padding independent of array dimension

Given an array image, which might be 2D, 3D or 4D but preferably any nD array, I want to extract a contiguous part of the array around a point, with a list denoting how far I extend along each axis, and pad the patch with pad_value wherever the extension falls outside the image.
I came up with this:
def extract_patch_around_point(image, loc, extend, pad_value=0):
    offsets_low = []
    offsets_high = []
    for i, x in enumerate(loc):
        offset_low = -np.min([x - extend[i], 0])
        offsets_low.append(offset_low)
        # compare against the size of axis i (originally image.shape[1] for every axis)
        offset_high = np.max([x + extend[i] - image.shape[i] + 1, 0])
        offsets_high.append(offset_high)
    upper_patch_offsets = []
    lower_image_offsets = []
    upper_image_offsets = []
    for i in range(image.ndim):
        upper_patch_offset = 2*extend[i] + 1 - offsets_high[i]
        upper_patch_offsets.append(upper_patch_offset)
        image_offset_low = loc[i] - extend[i] + offsets_low[i]
        image_offset_high = np.min([loc[i] + extend[i] + 1, image.shape[i]])
        lower_image_offsets.append(image_offset_low)
        upper_image_offsets.append(image_offset_high)
    patch = pad_value*np.ones(2*np.array(extend) + 1)
    # This is ugly
    A = np.ix_(range(offsets_low[0], upper_patch_offsets[0]),
               range(offsets_low[1], upper_patch_offsets[1]))
    B = np.ix_(range(lower_image_offsets[0], upper_image_offsets[0]),
               range(lower_image_offsets[1], upper_image_offsets[1]))
    patch[A] = image[B]
    return patch
Currently it only works in 2D because of the indexing trick with A, B etc. I do not want to check for the number of dimensions and use a different indexing scheme. How can I make this independent of image.ndim?
Based on my understanding of the requirements, I would suggest a zeros-padded version and then using slice notation to keep it generic on number of dimensions, like so -
def extract_patch_around_point(image, loc, extend, pad_value=0):
    extend = np.asarray(extend)
    image_ext_shp = image.shape + 2*extend
    image_ext = np.full(image_ext_shp, pad_value)
    # explicit stop values, since slice(i, -i) would be empty for i == 0;
    # tuples are needed for multi-axis indexing in recent NumPy
    insert_idx = tuple(slice(i, i + n) for i, n in zip(extend, image.shape))
    image_ext[insert_idx] = image
    region_idx = tuple(slice(i, j) for i, j in zip(loc, extend*2 + 1 + loc))
    return image_ext[region_idx]
Sample runs -
2D case :
In [229]: np.random.seed(1234)
...: image = np.random.randint(11,99,(13,8))
...: loc = (5,3)
...: extend = np.array([2,4])
...:
In [230]: image
Out[230]:
array([[58, 94, 49, 64, 87, 35, 26, 60],
[34, 37, 41, 54, 41, 37, 69, 80],
[91, 84, 58, 61, 87, 48, 45, 49],
[78, 22, 11, 86, 91, 14, 13, 30],
[23, 76, 86, 92, 25, 82, 71, 57],
[39, 92, 98, 24, 23, 80, 42, 95],
[56, 27, 52, 83, 67, 81, 67, 97],
[55, 94, 58, 60, 29, 96, 57, 48],
[49, 18, 78, 16, 58, 58, 26, 45],
[21, 39, 15, 93, 66, 89, 34, 61],
[73, 66, 95, 11, 44, 32, 82, 79],
[92, 63, 75, 96, 52, 12, 25, 14],
[41, 23, 84, 30, 37, 79, 75, 33]])
In [231]: image[loc]
Out[231]: 24
In [232]: out = extract_patch_around_point(image, loc, extend, pad_value=0)
In [233]: out
Out[233]:
array([[ 0, 78, 22, 11, 86, 91, 14, 13, 30],
[ 0, 23, 76, 86, 92, 25, 82, 71, 57],
[ 0, 39, 92, 98, 24, 23, 80, 42, 95], <-- At middle
[ 0, 56, 27, 52, 83, 67, 81, 67, 97],
[ 0, 55, 94, 58, 60, 29, 96, 57, 48]])
^
3D case :
In [234]: np.random.seed(1234)
...: image = np.random.randint(11,99,(13,5,8))
...: loc = (5,2,3)
...: extend = np.array([1,2,4])
...:
In [235]: image[loc]
Out[235]: 82
In [236]: out = extract_patch_around_point(image, loc, extend, pad_value=0)
In [237]: out.shape
Out[237]: (3, 5, 9)
In [238]: out
Out[238]:
array([[[ 0, 23, 87, 19, 58, 98, 36, 32, 33],
[ 0, 56, 30, 52, 58, 47, 50, 28, 50],
[ 0, 70, 93, 48, 98, 49, 19, 65, 28],
[ 0, 52, 58, 30, 54, 55, 46, 53, 31],
[ 0, 37, 34, 13, 76, 38, 89, 79, 71]],
[[ 0, 14, 92, 58, 72, 74, 43, 24, 67],
[ 0, 59, 69, 46, 68, 71, 94, 20, 71],
[ 0, 61, 62, 60, 82, 92, 15, 14, 57], <-- At middle
[ 0, 58, 74, 95, 16, 94, 83, 83, 74],
[ 0, 67, 25, 92, 71, 19, 52, 44, 80]],
[[ 0, 74, 28, 12, 12, 13, 62, 88, 63],
[ 0, 25, 58, 86, 76, 40, 20, 91, 61],
[ 0, 28, 42, 85, 22, 45, 64, 35, 66],
[ 0, 64, 34, 69, 27, 17, 92, 89, 68],
[ 0, 15, 57, 86, 17, 98, 29, 59, 50]]])
^
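If padding a full copy of the image is too costly for large inputs, one possible variant is to clip the window to the image bounds and copy only the overlapping region into a pre-filled patch. This is a sketch under that assumption, not the method from the answer above; the helper name `extract_patch_clipped` and its internal variable names are hypothetical.

```python
import numpy as np

def extract_patch_clipped(image, loc, extend, pad_value=0):
    """Copy the in-bounds part of the window into a pre-filled patch."""
    loc = np.asarray(loc)
    extend = np.asarray(extend)
    shape = np.asarray(image.shape)

    lo = loc - extend                  # window start in image coordinates, may be negative
    hi = loc + extend + 1              # window stop (exclusive), may exceed image.shape

    src_lo = np.clip(lo, 0, shape)     # window clipped to the image
    src_hi = np.clip(hi, 0, shape)
    dst_lo = src_lo - lo               # where the clipped region starts inside the patch
    dst_hi = dst_lo + (src_hi - src_lo)

    patch = np.full(2 * extend + 1, pad_value, dtype=image.dtype)
    src = tuple(slice(a, b) for a, b in zip(src_lo, src_hi))
    dst = tuple(slice(a, b) for a, b in zip(dst_lo, dst_hi))
    patch[dst] = image[src]
    return patch

image = np.arange(1, 13 * 8 + 1).reshape(13, 8)
patch = extract_patch_clipped(image, loc=(5, 3), extend=(2, 4))
print(patch.shape)                   # (5, 9)
print(patch[2, 4] == image[5, 3])    # True, the requested point sits at the patch centre
```

Because the slices are built per axis in a tuple, the same function applies to 3D or 4D input without any change.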
Here is a simple working example that demonstrates how to iteratively "whittle down" your input matrix to obtain the patch around a point in n dimensions:
import numpy as np
# Givens. Matrix to be sliced, point around which to slice,
# and the padding around the given point
matrix = np.random.normal(size=[5,5,5])
loc = (3,3,3)
padding = 2
# If one knows the dimensionality, the slice can be obtained easily
ans1 = matrix[loc[0] - padding:loc[0] + 1,
              loc[1] - padding:loc[1] + 1,
              loc[2] - padding:loc[2] + 1]
# If one does not know the dimensionality, the slice can be
# obtained iteratively
ans2 = matrix
for i in range(matrix.ndim):
    # Compute slice for the particular axis
    s = slice(loc[i] - padding, loc[i] + 1, 1)
    # Move particular axis to front, slice it, then move it back
    ans2 = np.moveaxis(np.moveaxis(ans2, i, 0)[s], 0, i)
# Assert the two answers are equal
np.testing.assert_array_equal(ans1, ans2)
This example does not handle windows that extend beyond the array bounds. Note that an out-of-range slice does not raise in NumPy, so that case has to be checked explicitly in the loop.
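A minimal sketch of such a bounds check, assuming the same one-sided window (padding elements before the point, up to and including the point) as in the example above: the lower bound is clamped at 0 by hand, and the slices are collected in a tuple so that no axis shuffling is needed. The `np.arange` input and the name `index` are choices made for this illustration.

```python
import numpy as np

matrix = np.arange(125).reshape(5, 5, 5)
loc = (1, 3, 3)
padding = 2

# Slices never raise on out-of-range bounds; a negative start silently counts
# from the end of the axis, so the lower bound is clamped at 0 explicitly.
index = tuple(slice(max(c - padding, 0), c + 1) for c in loc)
patch = matrix[index]

print(patch.shape)   # (2, 3, 3)
```

Here the first axis yields only two elements because the window is cut off at the array edge; padding the result back to full size would be a separate step.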

Numpy array of random matrices

I'm new to python/numpy and I need to create an array containing matrices of random numbers.
What I've got so far is this:
for i in xrange(samples):
SPN[] = np.random.random((6,5)) * np.random.randint(0,100)
Which makes sense to me as a PHP developer but does not work in Python. So how do I create a 3-dimensional array to contain these matrices/arrays?
Both np.random.randint and np.random.uniform, like most of the np.random functions, accept a size parameter, so in numpy we'd do it in one step:
>>> SPN = np.random.randint(0, 100, (3, 6, 5))
>>> SPN
array([[[45, 95, 56, 78, 90],
[87, 68, 24, 62, 12],
[11, 26, 75, 57, 12],
[95, 87, 47, 69, 90],
[58, 24, 49, 62, 85],
[38, 5, 57, 63, 16]],
[[61, 67, 73, 23, 34],
[41, 3, 69, 79, 48],
[22, 40, 22, 18, 41],
[86, 23, 58, 38, 69],
[98, 60, 70, 71, 3],
[44, 8, 33, 86, 66]],
[[62, 45, 56, 80, 22],
[27, 95, 55, 87, 22],
[42, 17, 48, 96, 65],
[36, 64, 1, 85, 31],
[10, 13, 15, 7, 92],
[27, 74, 31, 91, 60]]])
>>> SPN.shape
(3, 6, 5)
>>> SPN[0].shape
(6, 5)
.. actually, it looks like you may want np.random.uniform(0, 100, (samples, 6, 5)), because you want the elements to be floating point, not integers. Well, it works the same way. :^)
Note that what you did isn't equivalent to np.random.uniform, because you're choosing an array of values between 0 and 1 and then multiplying all of them by a fixed integer. I'm assuming that wasn't actually what you were trying to do, because it's a little unusual; please comment if that is what you actually wanted.
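If a separate integer factor per matrix really is what was intended, a vectorized sketch could look like the following. The seeded generator `rng`, the name `factors`, and the choice of samples = 3 are assumptions for the example.

```python
import numpy as np

samples = 3
rng = np.random.default_rng(0)   # seeded only to make the sketch repeatable

# One random integer factor per matrix, shaped (samples, 1, 1) so that it
# broadcasts over each (6, 5) matrix
factors = rng.integers(0, 100, size=(samples, 1, 1))
SPN = rng.random((samples, 6, 5)) * factors

print(SPN.shape)      # (3, 6, 5)
print(SPN[0].shape)   # (6, 5)
```

Each SPN[k] is then a (6, 5) matrix of values in [0, 1) multiplied by its own factor, which mirrors what the loop in the question appears to attempt.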
