I'm having some trouble understanding the rules for array broadcasting in Numpy.
Obviously, if you perform element-wise multiplication on two arrays of the same dimensions and shape, everything is fine. Also, if you multiply a multi-dimensional array by a scalar it works. This I understand.
But if you have two N-dimensional arrays of different shapes, it's unclear to me exactly what the broadcasting rules are. This documentation/tutorial explains that: In order to broadcast, the size of the trailing axes for both arrays in an operation must either be the same size or one of them must be one.
Okay, so I assume by trailing axis they are referring to the N in a M x N array. So, that means if I attempt to multiply two 2D arrays (matrices) with equal number of columns, it should work? Except it doesn't...
>>> from numpy import *
>>> A = array([[1,2],[3,4]])
>>> B = array([[2,3],[4,6],[6,9],[8,12]])
>>> print(A)
[[1 2]
[3 4]]
>>> print(B)
[[ 2 3]
[ 4 6]
[ 6 9]
[ 8 12]]
>>>
>>> A * B
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: shape mismatch: objects cannot be broadcast to a single shape
Since both A and B have two columns, I would have thought this would work. So, I'm probably misunderstanding something here about the term "trailing axis", and how it applies to N-dimensional arrays.
Can someone explain why my example doesn't work, and what is meant by "trailing axis"?
Well, the meaning of trailing axes is explained on the linked documentation page.
If you have two arrays with different dimensions number, say one 1x2x3 and other 2x3, then you compare only the trailing common dimensions, in this case 2x3. But if both your arrays are two-dimensional, then their corresponding sizes have to be either equal or one of them has to be 1. Dimensions along which the array has size 1 are called singular, and the array can be broadcasted along them.
In your case you have a 2x2 and 4x2 and 4 != 2 and neither 4 or 2 equals 1, so this doesn't work.
From http://cs231n.github.io/python-numpy-tutorial/#numpy-broadcasting:
Broadcasting two arrays together follows these rules:
If the arrays do not have the same rank, prepend the shape of the lower rank array with 1s until both shapes have the same length.
The two arrays are said to be compatible in a dimension if they have the same size in the dimension, or if one of the arrays has size 1 in that dimension.
The arrays can be broadcast together if they are compatible in all dimensions.
After broadcasting, each array behaves as if it had shape equal to the elementwise maximum of shapes of the two input arrays.
In any dimension where one array had size 1 and the other array had size greater than 1, the first array behaves as if it were copied along that dimension
If this explanation does not make sense, try reading the explanation from the documentation or this explanation.
we should consider two points about broadcasting. first: what is possible. second: how much of the possible things is done by numpy.
I know it might look a bit confusing, but I will make it clear by some example.
lets start from the zero level.
suppose we have two matrices. first matrix has three dimensions (named A) and the second has five (named B). numpy tries to match last/trailing dimensions. so numpy does not care about the first two dimensions of B. then numpy compares those trailing dimensions with each other. and if and only if they be equal or one of them be 1, numpy says "O.K. you two match". and if it these conditions don't satisfy, numpy would "sorry...its not my job!".
But I know that you may say comparison was better to be done in way that can handle when they are devisable(4 and 2 / 9 and 3). you might say it could be replicated/broadcasted by a whole number(2/3 in out example). and i am agree with you. and this is the reason I started my discussion with a distinction between what is possible and what is the capability of numpy.
Related
I'm doing a lot of vector algebra and want to use numpy arrays to remove any need for loops and run faster.
What I've found is that if I have a matrix A of size [N,P] I constantly need to use np.array([A[:,0]).T to force A[:,0] to be a column vector of size (N,1)
Is there a way to keep the single row or column of a 2D array as a 2D array because it makes the following arithmetic sooo much easier. For example, I often have to multiply a column vector (from a matrix) with a row vector (also created from a matrix) to create a new matrix: eg
C = A[:,i] * B[j,:]
it'd be be great if I didn't have to keep using:
C = np.array([A[:,i]]).T * np.array([B[j,:]])
It really obfuscates the code - in MATLAB it'd simply be C = A[:,i] * B[j,:] which is easier to read and compare with the underlying mathematics, especially if there's a lot of terms like this in the same line, but unfortunately most of my colleagues don't have MATLAB licenses.
Note this isn't the only use case, so a specific function for this column x row operation isn't too helpful
Even MATLAB/Octave squeezes out excess dimensions:
>> ones(2,3,4)(:,:,1)
ans =
1 1 1
1 1 1
>> size(ones(2,3,4)(1,:)) # some indexing "flattens" outer dims
ans =
1 12
When I started MATLAB v3.5 2d matrix was all it had; cells, struct and higher dimensions were later additions (as demonstrated by the above examples).
Your:
In [760]: A=np.arange(6).reshape(2,3)
In [762]: np.array([A[:,0]]).T
Out[762]:
array([[0],
[3]])
is more convoluted than needed. It makes a list, then a (1,N) array from that, and finally a (N,1)
A[:,[0]], A[:,:,None], A[:,0:1] are more direct. Even A[:,0].reshape(-1,1)
I can't think of something simple that treats a scalar and list index the same.
Functions like np.atleast_2d can conditionally add a new dimension, but it will be a leading (outer) one. But by the rules of broadcasting leading dimensions are usually 'automatic'.
basic v advanced indexing
In the underlying Python, scalars can't be indexed, and lists can only be indexed with scalars and slices. The underlying syntax allows indexing with tuples, but lists reject those. It's numpy that has extended the indexing considerably - not with syntax but with how it handles those tuples.
numpy indexing with slices and scalars is basic indexing. That's where the dimension loss can occur. That's consistent with list indexing
In [768]: [[1,2,3],[4,5,6]][1]
Out[768]: [4, 5, 6]
In [769]: np.array([[1,2,3],[4,5,6]])[1]
Out[769]: array([4, 5, 6])
Indexing with lists and arrays is advanced indexing, without any list counterparts. This is perhaps where the differences between MATLAB and numpy are ugliest :)
>> A([1,2],[1,2])
produces a (2,2) block. In numpy that produces a "diagonal"
In [781]: A[[0,1],[0,1]]
Out[781]: array([0, 4])
To get the block we have to use lists (or arrays) that "broadcast" against each other:
In [782]: A[[[0],[1]],[0,1]]
Out[782]:
array([[0, 1],
[3, 4]])
To get the "diagonal" in MATLAB we have to use sub2ind([2,2],[1,2],[1,2]) to get the [1,4] flat indices.
What kind of multiplication?
In
np.array([A[:,i]]).T * np.array([B[j,:]])
is this elementwise (.*) or matrix?
For a (N,1) and (1,M) pair, A*B and A#B produce the same (N,M) result, but one uses broadcasting to generalize the outer product, and the other is inner/matrix product (with sum-of-products).
https://numpy.org/doc/stable/reference/generated/numpy.matrix.html
Returns a matrix from an array-like object, or from a string of data. A matrix is a specialized 2-D array that retains its 2-D nature through operations. It has certain special operators, such as * (matrix multiplication) and ** (matrix power).
I'm not sure how to re-implement it though, it's an interesting exercise.
As mentionned, matrix will be deprecated. But from np.array, you can specify the dimension with the argument ndim=2:
np.array([1, 2, 3], ndmin=2)
You can keep the dimension in the following way (using # for matrix multiplication)
C = A[:,[i]] # B[[j],:]
Notice the brackets around i and j, otherwise C won't be a 2-dimensional matrix.
I have a 5 dimension array like this
a=np.random.randint(10,size=[2,3,4,5,600])
a.shape #(2,3,4,5,600)
I want to get the first element of the 2nd dimension, and several elements of the last dimension
b=a[:,0,:,:,[1,3,5,30,17,24,30,100,120]]
b.shape #(9,2,4,5)
as you can see, the last dimension was automatically converted to the first dimension.
why? and how to avoid that?
This behavior is described in the numpy documentation. In the expression
a[:,0,:,:,[1,3,5,30,17,24,30,100,120]]
both 0 and [1,3,5,30,17,24,30,100,120] are advanced indexes, separated by slices. As the documentation explains, in such case dimensions coming from advanced indexes will be first in the resulting array.
If we replace 0 by the slice 0:1 it will change this situation (since it will leave only one advanced index), and then the order of dimensions will be preserved. Thus one way to fix this issue is to use the 0:1 slice and then squeeze the appropriate axis:
a[:,0:1,:,:,[1,3,5,30,17,24,30,100,120]].squeeze(axis=1)
Alternatively, one can keep both advanced indexes, and then rearrange axes:
np.moveaxis(a[:,0,:,:,[1,3,5,30,17,24,30,100,120]], 0, -1)
Why can't I get the transpose when of alpha but I can get it for beta? What do the additional [] do?
alpha = np.array([1,2,3,4])
alpha.shape
alpha.T.shape
beta = np.array([[1,2,3,4]])
beta.shape
beta.T.shape
From the documention (link):
Transposing a 1-D array returns an unchanged view of the original array.
The array [1,2,3,4] is 1-D while the array [[1,2,3,4]] is a 1x4 2-D array.
The second pair of bracket indicates that it is a 2D array, so with such and array the transposed array is different from the first array (since the transpose switches the 2 dimensions). However if the array is only 1D the transpose doesn't change anything and the resulting array is equal to the starting one.
alpha is a 1D array, the transpose is itself.
beta is a 2D array, so you can transform (1,n) to (n,1).
To do the same with alpha, you need to add a dimension, you don't need to transpose it:
alpha[:, None]
alpha is a 1D array with shape (4,). The transpose is just alpha again, i.e. alpha == alpha.T.
beta is a 2D array with shape (1,4). It's a single row, but it has two dimensions. Its transpose looks like a single column with shape (4,1).
When I arrived at the programming language world, having come from the "math side of the business" this also seemed strange to me. After giving some thought to it I realized that from a programming perspective they are different. Have a look at the following list:
a = [1,2,3,4,5]
This is a 1D structure. This is so, because to get back the values 1,2,3,4 and 5 you just need to assign one address value. 3 would be returned if you issued the command a[2] for instance.
Now take a look at this list:
b = [[ 1, 2, 3, 4, 5],
[11, 22, 33, 44, 55]]
To get back the 11 for instance you would need two positional numbers, 1 because 11 is located in the 2nd list and 0 because in the second list it is located in the first position. In other words b[1,0] gives back to you 11.
Now comes the trick part. Look at this third list:
c = [ [ 100, 200, 300, 400, 500] ]
If you look carefully each number requires 2 positional numbers to be taken back from the list. 300 for instance requires 0 because it is located in the first (and only) list and 2 because it is the third element of the first list. c[0,2] gets you back 300.
This list can be transposed because it has two dimensions and the transposition operation is something that switches the positional arguments. So c.T would give you back a list whose shape would be [5,1], since c has a [1,5] shape.
Get back to list a. There you have a list with only one positional number. That list has a shape of [5] only, so thereĀ“s no second positional argument to the transposition operation to work with. Therefore it remains [5] and if you try a.T you get back a.
Got it?
Best regards,
Gustavo,
This question already has an answer here:
Logical indexing in Numpy with two indices as in MATLAB
(1 answer)
Closed 7 years ago.
I am trying to write some code that uses logical numpy arrays to index a larger array, similar to how MATLAB allows array indexing with logical arrays.
import numpy as np
m = 4
n = 4
unCov = np.random.randint(10, size = (m,n) )
rowCov = np.zeros( m, dtype = bool )
colCov = np.ones( n, dtype = bool )
>>> unCov[rowCov, rowCov]
[] # as expected
>>> unCov[colCov, colCov]
[0 8 3 3] # diagonal values of unCov, as expected
>>> unCov[rowCov, colCov]
ValueError: shape mismatch: objects cannot be broadcast to a single shape
For this last evaluation, I expected an empty array, similar to what MATLAB returns. I'd rather not have to check rowCov/colCov for True elements prior to indexing. Why is this happening, and is there a better way to do this?
As I understand it, numpy will translate your 2d logical indices to actual index vectors: arr[[True,False],[False,True]] would become arr[0,1] for an ndarray of shape (2,2). However, in your last case the second index array is full False, hence it corresponds to an index array of length 0. This is paired with the other full True index vector, corresponding to an index array of length 4.
From the numpy manual:
If the index arrays do not have the same shape, there is an attempt to
broadcast them to the same shape. If they cannot be broadcast to the
same shape, an exception is raised:
In your case, the error is exactly due to this:
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
<ipython-input-1411-28e41e233472> in <module>()
----> 1 unCov[colCov,rowCov]
IndexError: shape mismatch: indexing arrays could not be broadcast together with shapes (4,) (0,)
MATLAB, on the other hand, automatically returns an empty array if the index array is empty along any given dimension.
This actually highlights a fundamental difference between the logical indexing in MATLAB and numpy. In MATLAB, vectors in subscript indexing always slice out a subarray. That is, both
arr([1,2],[1,2])
and
arr([true,true],[true,true])
will return the 2 x 2 submatrix of the matrix arr. If the logical index vectors are shorter than the given dimension of the array, the missing indexing elements are assumed to be false. Fun fact: the index vector can also be longer than the given dimension, as long as the excess elements are all false. So the above is also equivalent to
arr([true,true,false,false],[true,true])
and
arr([true,true,false,false,false,false,false],[true,true])
for a 4 x 4 array (for the sake of argument).
In numpy, however, indexing with boolean-valued numpy arrays in this way will try to extract a vector. Furthermore, the boolean index vectors should be the same length as the dimension they are indexing into. In your 4 x 4 example,
unCov[np.array([True,True]),np.array([True,True])]
and
unCov[np.array([True,True,False,False,False]),np.array([True,True,False,False,False])]
both return the two first diagonal elements, so not a submatrix but rather a vector. Furthermore, they also give the less-then-encouraging warning along the lines of
/usr/bin/ipython:1: VisibleDeprecationWarning: boolean index did not match indexed array along dimension 0; dimension is 4 but corresponding boolean dimension is 5
So, in numpy, your logical indexing vectors should be the same length as the corresponding dimensions of the ndarray. And then what I wrote above holds true: the logical values are translated into indices, and the result is expected to be a vector. The length of this vector is the number of True elements in every index vector, so if your boolean index vectors have a different number of True elements, then the referencing doesn't make sense, and you get the error that you get.
numpy provides three handy routines to turn an array into at least a 1D, 2D, or 3D array, e.g. through numpy.atleast_3d
I need the equivalent for one more dimension: atleast_4d. I can think of various ways using nested if statements but I was wondering whether there is a more efficient and faster method of returning the array in question. In you answer, I would be interested to see an estimate (O(n)) of the speed of execution if you can.
The np.array method has an optional ndmin keyword argument that:
Specifies the minimum number of dimensions that the resulting array
should have. Ones will be pre-pended to the shape as needed to meet
this requirement.
If you also set copy=False you should get close to what you are after.
As a do-it-yourself alternative, if you want extra dimensions trailing rather than leading:
arr.shape += (1,) * (4 - arr.ndim)
Why couldn't it just be something as simple as this:
import numpy as np
def atleast_4d(x):
if x.ndim < 4:
y = np.expand_dims(np.atleast_3d(x), axis=3)
else:
y = x
return y
ie. if the number of dimensions is less than four, call atleast_3d and append an extra dimension on the end, otherwise just return the array unchanged.