Extract sub-array from 2D array using logical indexing - python

I am trying to extract a sub-array using logical indexing, as follows:
a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]])
a
Out[45]:
array([[ 1, 2, 3, 4],
[ 5, 6, 7, 8],
[ 9, 10, 11, 12],
[13, 14, 15, 16]])
b = np.array([False, True, False, True])
a[b, b]
Out[49]: array([ 6, 16])
NumPy pairs up the logical indexes in b element by element, so only a[1,1] and a[3,3] are returned. However, in MATLAB you can do something like:
>> a = [1 2 3 4; 5 6 7 8; 9 10 11 12; 13 14 15 16]
a =
1 2 3 4
5 6 7 8
9 10 11 12
13 14 15 16
>> b = [2 4]
b =
2 4
>> a(b, b)
ans =
6 8
14 16
How can I achieve the same result in Python without doing the following?
c = a[:, b]
c[b,:]
Out[51]:
array([[ 6, 8],
[14, 16]])

NumPy supports logical indexing, though it works a little differently from what you are familiar with in MATLAB. To get the result you want, you can do the following:
a[b][:,b] # the first brackets select the rows, the second select the columns
Out[27]:
array([[ 6, 8],
[14, 16]])
The more "NumPy" method will make sense once you understand what happened in your case.
b = np.array([False, True, False, True]) behaves like b = np.array([1,3]), which is easier to explain. When you write a[[1,3],[1,3]], NumPy creates an array of shape (2,), placing a[1,1] in location [0] and a[3,3] in location [1]. To create an output of shape (2,2), the index arrays must themselves have shape (2,2) (or broadcast to it). Therefore, the following will get your result:
a[[[1,1],[3,3]],[[1,3],[1,3]]]
Out[28]:
array([[ 6, 8],
[14, 16]])
Explanation:
The indexing arrays are:
temp_rows = np.array([[1,1],
                      [3,3]])
temp_cols = np.array([[1,3],
                      [1,3]])
Both arrays have shape (2,2), and therefore NumPy will create an output of shape (2,2). It places a[1,1] in location [0,0], a[1,3] in [0,1], a[3,1] in [1,0] and a[3,3] in [1,1]. This can be expanded to any shape, but for your purposes you wanted a shape of (2,2).
After figuring this out, you can make things even simpler by using the fact that if you index the first dimension with a (2,1) array and the second dimension with a (1,2) array (or a plain length-2 array), NumPy will broadcast them against each other, similar to the MATLAB operation. This means that by using:
temp_rows = np.array([[1],[3]])
temp_cols = np.array([1,3])
you can do:
a[[[1],[3]], [1,3]]
Out[29]:
array([[ 6, 8],
[14, 16]])
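The broadcasting trick above also works when starting from the boolean mask in the question. A minimal sketch, assuming the mask is first converted to integer positions with np.flatnonzero (the names idx and sub are only for illustration):

```python
import numpy as np

a = np.arange(1, 17).reshape(4, 4)
b = np.array([False, True, False, True])

# Convert the boolean mask to the integer positions of its True entries.
idx = np.flatnonzero(b)

# A (2, 1) row index broadcast against a (2,) column index gives a (2, 2) result.
sub = a[idx[:, None], idx]
print(sub)
```

By contrast, a[idx, idx] pairs the indices up and returns only the two "diagonal" elements.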

You could use np.ix_ here.
a[np.ix_(b, b)]
# array([[ 6, 8],
# [14, 16]])
Output returned by np.ix_
>>> np.ix_(b, b)
(array([[1],
[3]]),
array([[1, 3]]))
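np.ix_ is not limited to using the same mask twice. As a small sketch (the masks rows and cols below are made-up examples, not from the question), different selections can be passed for each axis:

```python
import numpy as np

a = np.arange(1, 17).reshape(4, 4)
rows = np.array([True, False, False, True])   # select rows 0 and 3
cols = np.array([False, True, True, False])   # select columns 1 and 2

# np.ix_ turns each mask into an index array shaped so that they broadcast
# into the full rows-by-columns grid.
block = a[np.ix_(rows, cols)]
print(block)
```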

You could make use of an outer product of the b vector. The new dimension can be obtained from the number of True values using a sum.
import numpy as np
a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]])
b = np.array([False, True, False, True])
#
M = np.outer(b, b)
new_dim = b.sum()
new_shape = (new_dim, new_dim)
selection = a[M].reshape(new_shape)
The result looks like
[[ 6 8]
[14 16]]
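The same idea should carry over to a different mask per axis. A sketch under that assumption (rows and cols are illustrative names): the outer product of two boolean vectors is a 2D boolean mask, and the selected elements come back in row-major order, so they can be reshaped to (rows.sum(), cols.sum()).

```python
import numpy as np

a = np.arange(1, 17).reshape(4, 4)
rows = np.array([False, True, True, True])   # rows 1, 2, 3
cols = np.array([True, False, False, True])  # columns 0 and 3

# True exactly where both the row and the column are selected.
M = np.outer(rows, cols)

# Boolean indexing returns a flat array in row-major order; restore the 2D shape.
selection = a[M].reshape(rows.sum(), cols.sum())
print(selection)
```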

Related

iterating via np.nditer function for numpy arrays

I'm a bit new to python and wanted to check for some values in my arrays to see if they go above or below a certain value and adjust them afterwards.
For the case of a 2d array with numpy I found this in some part of its manual.
import numpy as np
arr = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])
for x in np.nditer(arr[:, ::2]):
    print(x)
What's the syntax in Python to change the starting point, so that it doesn't iterate over every value from the first one but from an index I define, for example every 2nd or 3rd value? I need to check every 1st, 2nd, 3rd (and so on) value in my arrays against a different threshold. Or is there maybe a better way to do this?

I suspect you need to read some more basic numpy docs.
You created a 2d array:
In [5]: arr = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])
In [6]: arr
Out[6]:
array([[1, 2, 3, 4],
[5, 6, 7, 8]])
You can view it as a 1d array (nditer iterates as flat)
In [7]: arr.ravel()
Out[7]: array([1, 2, 3, 4, 5, 6, 7, 8])
You can use standard slice notation to select every other element of the flattened array:
In [8]: arr.ravel()[::2]
Out[8]: array([1, 3, 5, 7])
or every other column of the original:
In [9]: arr[:,::2]
Out[9]:
array([[1, 3],
[5, 7]])
You can test every value, such as for being odd:
In [10]: arr % 2 == 1
Out[10]:
array([[ True, False, True, False],
[ True, False, True, False]])
and use that array to select those values:
In [11]: arr[arr % 2 == 1]
Out[11]: array([1, 3, 5, 7])
or modify them:
In [12]: arr[arr % 2 == 1] += 10
In [13]: arr
Out[13]:
array([[11, 2, 13, 4],
[15, 6, 17, 8]])
The documentation for nditer tends to overhype it. It's useful in compiled code, but rarely useful in python. If the above whole-array methods don't work, you can iterate directly, with more control and understanding:
In [14]: for row in arr:
    ...:     print(row)
    ...:     for x in row:
    ...:         print(x)
[11 2 13 4]
11
2
13
4
[15 6 17 8]
15
6
17
8
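For the original goal of checking different positions against different limits, a whole-array approach might look like the sketch below. The per-column limits in upper are invented for illustration; the slice start (1 here) is what controls where the selection begins.

```python
import numpy as np

arr = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])

# Start at column index 1 and take every 2nd column from there.
every_second_from_1 = arr[:, 1::2]

# Hypothetical per-column upper limits, broadcast against every row.
upper = np.array([4, 5, 6, 7])
too_big = arr > upper

# Replace values above their column's limit with that limit.
adjusted = np.where(too_big, upper, arr)
print(every_second_from_1)
print(adjusted)
```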

Python realization of Matlab 'Outer product'

I am trying to rewrite the following snippet of Matlab code about outer product of matrices into python code,
function Y = matlab_outer_product(X,x)
A = reshape(X, [size(X) ones(1,ndims(x))]);
B = reshape(x, [ones(1,ndims(X)) size(x)]);
Y = squeeze(bsxfun(@times,A,B));
end
My one-to-one translation of this to Python is as follows (taking into account how the shapes of NumPy arrays and MATLAB matrices are arranged):
import numpy as np

def python_outer_product(X, x):
    X_shape = list(X.shape)
    x_shape = list(x.shape)
    A = X.reshape(*list(np.ones(np.ndim(x),dtype=int)),*X_shape)
    B = x.reshape(*x_shape,*list(np.ones(np.ndim(X),dtype=int)))
    Y = A*B
    return Y.squeeze()
Then trying the inputs, for instance,
matlab_outer_product([1,2],[[3,4];[5,6]])
python_outer_product(np.array([[1,2]]), np.array([[3,4],[5,6]]))
The outputs don't quite match. In matlab, it outputs
output(:,:,1) = [[3,5];[6,10]]
output(:,:,2) = [[4,6];[8,12]]
In python, it outputs
output = array([
[[ 3, 6],
[ 4, 8]],
[[ 5, 10],
[ 6, 12]]
])
They're almost identical, but not quite. What is wrong with my code, and how should I change the Python code to match the MATLAB output?

In full gory detail (since my MATLAB memory is old):
Octave
>> X = [1,2];
>> x = [[3,4];[5,6]];
>> A = reshape(X, [size(X) ones(1,ndims(x))]);
>> B = reshape(x, [ones(1,ndims(X)) size(x)]);
>> A
A =
1 2
>> B
B =
ans(:,:,1,1) = 3
ans(:,:,2,1) = 5
ans(:,:,1,2) = 4
ans(:,:,2,2) = 6
>> bsxfun(@times,A,B)
ans =
ans(:,:,1,1) =
3 6
ans(:,:,2,1) =
5 10
ans(:,:,1,2) =
4 8
ans(:,:,2,2) =
6 12
>> squeeze(bsxfun(@times,A,B))
ans =
ans(:,:,1) =
3 5
6 10
ans(:,:,2) =
4 6
8 12
You start with a (1,2) and a (2,2), and expand the second to (1,1,2,2). The bsxfun produces a (1,2,2,2), which is squeezed to (2,2,2).
A is X reshaped to [1 2 1 1], but the two trailing size 1 dimensions are squeezed out, resulting in no change.
This MATLAB outer product is a bit convoluted, using bsxfun to perform elementwise multiplication of (1,2,1,1) with (1,1,2,2). At least in Octave it's the same as
A.*B
In numpy
In [77]: X
Out[77]: array([[1, 2]]) # (1,2)
In [78]: x
Out[78]:
array([[3, 4], # (2,2)
[5, 6]])
Note that the MATLAB/Octave x when flattened has elements (3,5,4,6), while the numpy ravel is [3,4,5,6].
In numpy I can simply do:
In [79]: X[:,:,None,None]*x
Out[79]:
array([[[[ 3, 4], (1,2,2,2)
[ 5, 6]],
[[ 6, 8],
[10, 12]]]])
or without the extra size 1 dimension of X:
In [84]: (X[0,:,None,None]*x)
Out[84]:
array([[[ 3, 4],
[ 5, 6]],
[[ 6, 8],
[10, 12]]])
In [85]: (X[0,:,None,None]*x).ravel()
Out[85]: array([ 3, 4, 5, 6, 6, 8, 10, 12])
compare that with the Octave ravel
>> squeeze(bsxfun(@times,A,B))(:)'
ans =
3 6 5 10 4 8 6 12
We could add a transpose to the numpy
In [96]: (X[0,:,None,None]*x).transpose(2,1,0).ravel()
Out[96]: array([ 3, 6, 5, 10, 4, 8, 6, 12])
In [97]: (X[0,:,None,None]*x).transpose(2,1,0)
Out[97]:
array([[[ 3, 6],
[ 5, 10]],
[[ 4, 8],
[ 6, 12]]])
At least in numpy we can tweak the dimension order in lots of ways, so I won't try to suggest an optimal. I still think it's better to write code that's "natural" to numpy than to slavishly match the MATLAB order.
another try
I realized, above, that the MATLAB code is just doing A.*B with
(1,2,1,1) and (1,1,2,2) arrays, where the extra 1's were added to "broadcast".
Using transpose so that the same dimensions are outermost (leading, in numpy):
In [5]: X = X.T; x = x.T
In [6]: X.shape
Out[6]: (2, 1)
In [7]: x.shape
Out[7]: (2, 2)
In [8]: x
Out[8]:
array([[3, 5],
[4, 6]])
In [9]: x.ravel()
Out[9]: array([3, 5, 4, 6]) # compare with MATLAB (:)'
Elementwise multiplication with the same dimension expansion:
In [10]: X[None,None,:,:]*x[:,:,None,None]
Out[10]:
array([[[[ 3],
[ 6]],
[[ 5],
[10]]],
[[[ 4],
[ 8]],
[[ 6],
[12]]]])
In [11]: _.shape
Out[11]: (2, 2, 2, 1) # compare with octave (1,2,2,2)
In [12]: __.squeeze()
Out[12]:
array([[[ 3, 6],
[ 5, 10]],
[[ 4, 8],
[ 6, 12]]])
the ravel is the same as Octave:
In [13]: ___.ravel()
Out[13]: array([ 3, 6, 5, 10, 4, 8, 6, 12])
expand_dims can be used instead of the indexing. Internally it uses reshape:
In [15]: np.expand_dims(X,(0,1)).shape
Out[15]: (1, 1, 2, 1)
In [16]: np.expand_dims(x,(2,3)).shape
Out[16]: (2, 2, 1, 1)
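If only the index relationship Y(i,j,k) = X(i)*x(j,k) matters, rather than the printed layout, np.multiply.outer may be a shorter route. This is a sketch assuming X is passed as a 1D array; it is not a translation of the bsxfun code above.

```python
import numpy as np

X = np.array([1, 2])
x = np.array([[3, 4], [5, 6]])

# Y[i, j, k] == X[i] * x[j, k], shape (2, 2, 2)
Y = np.multiply.outer(X, x)

# Slicing the last axis corresponds to MATLAB's output(:,:,1) and output(:,:,2).
print(Y[:, :, 0])
print(Y[:, :, 1])
```

NumPy prints a 3D array as slices along the leading axis, while MATLAB prints slices along the trailing one, which is why the two displays look different even when the indexed values agree.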

Concatenate NumPy 2D array with column (1D array)

Suppose I have a 2D NumPy array values. I want to add a new column to it. The new column should be values[:, 19] lagged by one sample (with the first element equal to zero). It could be built as np.append([0], values[0:-2:1, 19]). I tried the approach from "Numpy concatenate 2D arrays with 1D array":
temp = np.append([0], [values[1:-2:1, 19]])
values = np.append(dataset.values, temp[:, None], axis=1)
but I get:
ValueError: all the input array dimensions except for the concatenation axis
must match exactly
I tried using c_ too as:
temp = np.append([0], [values[1:-2:1, 19]])
values = np.c_[values, temp]
but the effect is the same. How can this concatenation be done? I think the problem is the orientation of temp: it is treated as a row instead of a column, so the dimensions don't match. In Octave ' (the transpose operator) would do the trick. Maybe there is a similar solution in NumPy?
Anyway, thank you for your time.
Best regards,
Max

In [76]: values = np.arange(16).reshape(4,4)
In [77]: temp = np.concatenate(([0], values[1:,-1]))
In [78]: values
Out[78]:
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11],
[12, 13, 14, 15]])
In [79]: temp
Out[79]: array([ 0, 7, 11, 15])
This use of concatenate to make temp is similar to your use of append (which actually uses concatenate).
Sounds like you want to join values and temp in this way:
In [80]: np.concatenate((values, temp[:,None]),axis=1)
Out[80]:
array([[ 0, 1, 2, 3, 0],
[ 4, 5, 6, 7, 7],
[ 8, 9, 10, 11, 11],
[12, 13, 14, 15, 15]])
Again I prefer using concatenate directly.
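If "lagged by one sample" means that each row should carry the previous row's value (an assumption about the intent), the column has to lose its last element rather than its first. A minimal sketch on the same small array:

```python
import numpy as np

values = np.arange(16).reshape(4, 4)
col = values[:, -1]                          # [3, 7, 11, 15]

# Shift down by one: prepend 0 and drop the last sample so the length is unchanged.
lagged = np.concatenate(([0], col[:-1]))     # [0, 3, 7, 11]

# column_stack treats the 1D array as a column, so no explicit [:, None] is needed.
result = np.column_stack((values, lagged))
print(result)
```

The length of the new column must equal values.shape[0]; a mismatch there is what triggers the "dimensions must match exactly" error.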

You need to convert the 1D array to 2D as shown. You can then use vstack or hstack with reshaping to get the final array you want as shown:
a = np.array([[1, 2, 3],[4, 5, 6]])
b = np.array([[7, 8, 9]])
c = np.vstack([ele for ele in [a, b]])
print(c)
c = np.hstack([a.reshape(1,-1) for a in [a,b]]).reshape(-1,3)
print(c)
Either way, the output is:
[[1 2 3]
 [4 5 6]
 [7 8 9]]
Hope I understood the question correctly

Clean array indexing with arrays in numpy

I'm running into this problem over and over again, and can't seem to find a clean solution for this. So I'm trying to index an array with another array. I have a 2d numpy array. And a 1d numpy array with the same length as the 1st dimension of the 2d array I'm trying to index and the elements represent the indices of the columns I try to extract:
import numpy as np
A = np.random.rand(5,3)
B = np.asarray([2,1,2,0,1])
The behaviour that I want is extracting for all rows the corresponding column in array B. This could be done by
C = A[np.arange(A.shape[0]),B]
But I can imagine that there is a better way to get this behaviour. Using a : as indexing the first row gives the wrong behaviour.
If there is a cleaner way of doing this that would be great. I'm really used to this array indexing from Matlab, but maybe there is no equivalent in numpy. Using boolean indices is of course an option, but that also requires converting arrays all the time.

I think what you are looking for is np.choose(B,A.T):
In [125]: A
Out[125]:
array([[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8],
[ 9, 10, 11],
[12, 13, 14]])
In [126]: B = np.asarray([2,1,2,0,1])
In [127]: np.choose(B,A.T)
Out[127]: array([ 2, 4, 8, 9, 13])

In [60]: A = np.arange(1,16).reshape(5,3)
In [61]: B = np.array([2,1,2,0,1])
In [62]: C = A[np.arange(A.shape[0]),B]
In [63]: C
Out[63]: array([ 3, 5, 9, 10, 14])
In Octave
>> A = reshape(1:15, 3,5).';
>> B = [3,2,3,1,2];
>> A
A =
1 2 3
4 5 6
7 8 9
10 11 12
13 14 15
>> A(:,B)
ans =
3 2 3 1 2
6 5 6 4 5
9 8 9 7 8
12 11 12 10 11
15 14 15 13 14
This is the same as numpy:
In [65]: A[:,B]
Out[65]:
array([[ 3, 2, 3, 1, 2],
[ 6, 5, 6, 4, 5],
[ 9, 8, 9, 7, 8],
[12, 11, 12, 10, 11],
[15, 14, 15, 13, 14]])
You imply that there's something clean for indexing one item from each row in MATLAB/Octave, but I'm missing it. I used to work a lot in that language, but I've gotten out of practice.
sub2ind does the job:
>> sub2ind([3,5],B, 1:5)
ans =
3 5 9 10 14
>> A.'(:)(sub2ind([3,5],B,1:5))
ans =
3
5
9
10
14
(The 'F' vs 'C' ordering is complicating my comparison)
numpy has a similar ravel_multi_index:
In [69]: np.ravel_multi_index((np.arange(5),B),(5,3))
Out[69]: array([ 2, 4, 8, 9, 13], dtype=int32)
In [71]: A.flat[_]
Out[71]: array([ 3, 5, 9, 10, 14])
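Another option, available in NumPy 1.15 and later, is np.take_along_axis. The sketch below reuses the A and B from this answer; whether it counts as "cleaner" than the arange form is a matter of taste.

```python
import numpy as np

A = np.arange(1, 16).reshape(5, 3)
B = np.array([2, 1, 2, 0, 1])

# The index array must have the same number of dimensions as A, hence B[:, None].
# The result has shape (5, 1); [:, 0] flattens it back to 1D.
C = np.take_along_axis(A, B[:, None], axis=1)[:, 0]
print(C)
```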

Multiplying NumPy arrays by scalars

I have a NumPy array of shape (2,76020,2). Basically it is made of two columns containing 76020 rows each, and each row has two entries.
I want to multiply each column by a different weight, say column 1 by 3 and column 2 by 5. For example:
m =
[3,4][5,8]
[1,2][2,2]
a = [3,5]
I want:
[9,12][25,40]
[3,6][10,10]
I thought I could just multiply m*a, but that gives me instead:
[9,20][15,40]
[3,10][6,10]
How can I write this multiplication?

It's a problem of broadcasting: you must align the dimensions to multiply, here the second:
import numpy as np
m = np.array(
[[[3,4],[5,8]],
[[1,2],[2,2]]])
a = np.array([3,5])
print(a[None,:,None].shape, m*a[None,:,None])
"""
(1, 2, 1)
[[[ 9 12]
[25 40]]
[[ 3 6]
[10 10]]]
"""

As @B.M. says, this is an 'array broadcasting' issue. (The idea behind his answer is correct, but I think his and the OP's dimensions aren't matching up correctly.)
>>> m = np.array([[[3,4],[5,8]],[[1,2],[2,2]]])
>>> print(m)
[[[3 4]
[5 8]]
[[1 2]
[2 2]]]
>>> print(m.shape)
(2, 2, 2)
>>> a = np.array([3,5])
>>> print(a.shape)
(2,)
We need the shapes of m and a to be compatible, so we have to add axes to a so that it 'broadcasts' along the intended dimension:
>>> print(a[:, np.newaxis, np.newaxis].shape)
(2, 1, 1)
>>> b = a[:, np.newaxis, np.newaxis] * m
>>> print(b)
[[[ 9 12]
[15 24]]
[[ 5 10]
[10 10]]]
In this way the first dimension of a is preserved, and maps to each element of the first dimension of m. But there are also two new dimensions ('axes') created to 'broadcast' into the other two dimensions of m.
Note: np.newaxis is (literally) None, they have the same effect. The former is more readable to understand what's happening. Additionally, just in terms of standard terminology, the first dimension (axis) is generally referred to as the 'rows', and the second axis the 'columns'.

Your description is ambiguous
Basically it is made of two columns containing 76020 rows each, and each row has two entries.
In (2,76020,2), which 2 is columns, and which is entries?
I believe your m is (that display is also ambiguous)
In [8]: m
Out[8]:
array([[[3, 4],
[5, 8]],
[[1, 2],
[2, 2]]])
In [9]: m*a
Out[9]:
array([[[ 9, 20],
[15, 40]],
[[ 3, 10],
[ 6, 10]]])
That's the same as m*a[None,None,:]. When broadcasting, numpy automatically adds dimensions at the beginning as needed. Or iteratively:
In [6]: m[:,:,0]*3
Out[6]:
array([[ 9, 15],
[ 3, 6]])
In [7]: m[:,:,1]*5
Out[7]:
array([[20, 40],
[10, 10]])
Since m is (2,2,2) shape, we can't off hand tell which axis a is supposed to multiply.
According to the accepted answer, you want to multiply along the middle axis
In [16]: m*a[None,:,None]
Out[16]:
array([[[ 9, 12],
[25, 40]],
[[ 3, 6],
[10, 10]]])
But what if m was (2,3,2) in shape? a would then have to have 3 values
In [17]: m=np.array([[[3,4],[5,8],[0,0]],[[1,2],[2,2],[1,1]]])
In [18]: m*a[None,:,None]
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-18-f631c33646b7> in <module>()
----> 1 m*a[None,:,None]
ValueError: operands could not be broadcast together with shapes (2,3,2) (1,2,1)
The alternative broadcastings work
In [19]: m*a[:,None,None]
Out[19]:
array([[[ 9, 12],
[15, 24],
[ 0, 0]],
[[ 5, 10],
[10, 10],
[ 5, 5]]])
In [20]: m*a[None,None,:]
Out[20]:
array([[[ 9, 20],
[15, 40],
[ 0, 0]],
[[ 3, 10],
[ 6, 10],
[ 3, 5]]])
Now if m had distinct dimensions, e.g. (3,1000,2), we could tell at a glance which axis a 2-element weight array would work with.
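To make the choice of axis explicit instead of counting None entries, the reshape can be wrapped in a small function. scale_along_axis is a hypothetical helper written for this illustration, not a NumPy function:

```python
import numpy as np

def scale_along_axis(m, weights, axis):
    """Multiply m by a 1D weights array aligned with the given axis (hypothetical helper)."""
    shape = [1] * m.ndim
    shape[axis] = -1   # the weights occupy `axis`; every other axis has length 1
    return m * np.asarray(weights).reshape(shape)

m = np.array([[[3, 4], [5, 8]], [[1, 2], [2, 2]]])
a = np.array([3, 5])

print(scale_along_axis(m, a, axis=1))   # same as m * a[None, :, None]
```

With axis=0 this reproduces m*a[:,None,None], and with axis=2 the plain m*a.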
