I'm trying to figure out a good way of doing the following addition operation without using np.repeat to create a large dimension. If using np.repeat and adding is the best solution let me know.
I'm also confused about what broadcasting is doing in this case. Essentially, I have a 4D array, and I want to add a 2D array along axes 1 and 2, repeating the addition across axes 0 and 3.
This works correctly
a = np.arange(64).reshape((2,4,4,2)).astype(float)
b = np.ones((2,2))
a[:, 0:2, 0:2, : ] += b
This throws an error. What is a good way of doing this?
a[:, 0:3, 0:3, :] += np.ones((3,3))
This works but is not what I'm looking to do
c = np.arange(144).reshape(3,4,4,3).astype(float)
c[:, 0:3, 0:3, :] += np.ones((3,3))
You could include an empty axis from the start:
a[:, 0:3, 0:3, :] += np.ones((3,3,1)) # 1 broadcasts against any axis
Similarly, you should have used:
a[:, 0:2, 0:2, : ] += np.ones((2,2,1))
because you (probably inadvertently) broadcast these against the third and fourth axes. I think you wanted it to broadcast against the second and third, right?
Also you can always add dimensions with np.expand_dims and axis=-1:
>>> np.expand_dims(np.ones((2, 2)), axis=-1).shape
(2, 2, 1)
or slicing with None or np.newaxis (they are equivalent!):
>>> np.ones((2, 2))[None, :, :, np.newaxis].shape
(1, 2, 2, 1)
The first None is not necessary for correct broadcasting but the last one is!
In this context it is important to mention that NumPy broadcasts starting with the last dimension. So if you have two arrays, then for each dimension, starting from the last one, the sizes must be equal or one of them must be 1 (if one is 1, it is broadcast along that axis!). That's why a[:, 0:2, 0:2, : ] worked:
>>> a[:, 0:2, 0:2, : ].shape
(2, 2, 2, 2)
>>> b.shape
(2, 2)
So the last dimension is equal (both 2) and the second-last one is equal (both 2). However with:
>>> np.ones((2,2,1)).shape
(2, 2, 1)
Here the last dimensions are 2 and 1, so the last axis of np.ones((2,2,1)) is broadcast, while the two dimensions before it are equal (all 2), so NumPy uses element-wise operations there.
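The right-to-left alignment described above can be checked without doing any arithmetic. This is a minimal sketch using np.broadcast, which only reports the shape two operands would broadcast to; the variable names are chosen here for illustration and are not from the original answer.

```python
import numpy as np

a = np.arange(64).reshape((2, 4, 4, 2)).astype(float)
target = a[:, 0:2, 0:2, :]                     # shape (2, 2, 2, 2)

# (2, 2) lines up with the LAST two axes of the target (axes 2 and 3).
shape_2d = np.broadcast(target, np.ones((2, 2))).shape

# (2, 2, 1) lines up with axes 1, 2 and 3; the trailing 1 stretches over axis 3.
shape_3d = np.broadcast(target, np.ones((2, 2, 1))).shape

# With a 3x3 window the trailing 3 meets the last axis of length 2 and fails.
try:
    np.broadcast(a[:, 0:3, 0:3, :], np.ones((3, 3)))
    mismatch = False
except ValueError:
    mismatch = True

print(shape_2d, shape_3d, mismatch)
```

Both working cases report the same result shape, which is why the (2, 2) version runs without complaint even though it is aligned with different axes than intended.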
To align the axes of the array to be added, we need to insert a new axis at the end, like so -
a[:, 0:3, 0:3, :] += np.ones((3,3))[...,None]
Let's study the shapes here :
In [356]: a[:, 0:3, 0:3, :].shape
Out[356]: (2, 3, 3, 2)
In [357]: np.ones((3,3)).shape
Out[357]: (3, 3)
In [358]: np.ones((3,3))[...,None].shape
Out[358]: (3, 3, 1)
Input1 (a[:, 0:3, 0:3, :]) : (2, 3, 3, 2)
Input2 (np.ones((3,3))[...,None]) : (3, 3, 1)
Remember that the broadcasting rules state that singleton dimensions (dimensions of length 1) are broadcast to match the lengths of the corresponding non-singleton dimensions. Also, dimensions that are not listed are treated as having length 1 by default.
So, this is broadcastable and would work now.
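Since the question asks whether np.repeat is needed, here is a small sketch comparing the broadcast version with an explicit np.repeat version. The array b with distinct values is a hypothetical stand-in for np.ones((3, 3)), used only so that a misalignment would be visible.

```python
import numpy as np

a = np.arange(64).reshape((2, 4, 4, 2)).astype(float)
b = np.arange(9).reshape(3, 3).astype(float)   # hypothetical stand-in for np.ones((3, 3))

# Broadcasting version: the trailing axis of length 1 stretches over a's last axis.
via_broadcast = a.copy()
via_broadcast[:, 0:3, 0:3, :] += b[..., None]

# Explicit version: materialise the full (2, 3, 3, 2) block with np.repeat.
block = np.repeat(b[None, :, :, None], 2, axis=0)   # (2, 3, 3, 1)
block = np.repeat(block, 2, axis=3)                 # (2, 3, 3, 2)
via_repeat = a.copy()
via_repeat[:, 0:3, 0:3, :] += block

same = np.array_equal(via_broadcast, via_repeat)
print(same)
```

The two results agree, so the repeated copy is not needed; broadcasting gets the same answer without allocating the larger block.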
Part 2: Why does the following work?
c = np.arange(144).reshape(3,4,4,3).astype(float)
c[:, 0:3, 0:3, :] += np.ones((3,3))
Studying shapes again -
In [363]: c[:, 0:3, 0:3, :].shape
Out[363]: (3, 3, 3, 3)
In [364]: np.ones((3,3)).shape
Out[364]: (3, 3)
Input1 (c[:, 0:3, 0:3, :]) : (3, 3, 3, 3)
Input2 (np.ones((3,3))) : (3, 3)
Again, going by the broadcasting rules this is fine, so there is no error here. But (3, 3) is aligned with the last two axes rather than the middle two, so the result isn't the intended one.
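With np.ones((3, 3)) the silent and the intended versions happen to give the same numbers, because every element added is 1. The sketch below swaps in a hypothetical non-constant b to make the difference in alignment visible.

```python
import numpy as np

c = np.arange(144).reshape(3, 4, 4, 3).astype(float)
b = np.arange(9).reshape(3, 3).astype(float)   # hypothetical non-constant values

silent = c.copy()
silent[:, 0:3, 0:3, :] += b                # b[j, k] is added along axes 2 and 3

intended = c.copy()
intended[:, 0:3, 0:3, :] += b[..., None]   # b[i, j] is added along axes 1 and 2

differs = not np.array_equal(silent, intended)
print(differs)
```

At position [0, 0, 1, 2] the first version adds b[1, 2] while the second adds b[0, 1], which is the difference between the two alignments.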
I wish to compute the dot product between two 3D tensors along the first dimension. I tried the following einsum notation:
import numpy as np
a = np.random.randn(30).reshape(3, 5, 2)
b = np.random.randn(30).reshape(3, 2, 5)
# Expecting shape: (3, 5, 5)
np.einsum("ijk,ikj->ijj", a, b)
Sadly it returns this error:
ValueError: einstein sum subscripts string includes output subscript 'j' multiple times
I went with Einstein sum after I failed at it with np.tensordot. Ideas and follow up questions are highly welcome!
Your two dimensions of size 5 and 5 do not correspond to the same axes. As such you need to use two different subscripts to designate them. For example, you can do:
>>> res = np.einsum('ijk,ilm->ijm', a, b)
>>> res.shape
(3, 5, 5)
Notice you are also required to change the subscript for the axes of size 2 and 2. This is because you are computing a batched outer product (i.e. we iterate over the two axes independently), not a dot product (i.e. we iterate over the two axes simultaneously).
Outer product:
>>> np.einsum('ijk,ilm->ijm', a, b)
Dot product over subscript k, which is axis=2 of a and axis=1 of b:
>>> np.einsum('ijk,ikm->ijm', a, b)
which is equivalent to a@b.
"dot product ... along the first dimension" is a bit unclear. Is the first dimension a 'batch' dimension, with 3 dots on the rest? Or something else?
In [103]: a = np.random.randn(30).reshape(3, 5, 2)
...: b = np.random.randn(30).reshape(3, 2, 5)
In [104]: (a@b).shape
Out[104]: (3, 5, 5)
In [105]: np.einsum('ijk,ikl->ijl',a,b).shape
Out[105]: (3, 5, 5)
@Ivan's answer is different:
In [106]: np.einsum('ijk,ilm->ijm', a, b).shape
Out[106]: (3, 5, 5)
In [107]: np.allclose(np.einsum('ijk,ilm->ijm', a, b), a@b)
Out[107]: False
In [108]: np.allclose(np.einsum('ijk,ikl->ijl', a, b), a@b)
Out[108]: True
Ivan's version sums over the k dimension of one array and the l dimension of the other, and then does a broadcasted elementwise multiplication. That is not matrix multiplication:
In [109]: (a.sum(axis=-1,keepdims=True)* b.sum(axis=1,keepdims=True)).shape
Out[109]: (3, 5, 5)
In [110]: np.allclose((a.sum(axis=-1,keepdims=True)* b.sum(axis=1,keepdims=True)),np.einsum('ijk,ilm->ijm', a,
...: b))
Out[110]: True
Another test of the batch processing:
In [112]: res=np.zeros((3,5,5))
...: for i in range(3):
...:     res[i] = a[i]@b[i]
...: np.allclose(res, a@b)
Out[112]: True
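The question mentions giving up on np.tensordot. One possible reason, sketched here under the assumption that the contraction was over axis 2 of a and axis 1 of b, is that tensordot has no notion of a shared batch axis: it keeps both leading axes, and the batched product is only the part where the two batch indices agree.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.standard_normal((3, 5, 2))
b = rng.standard_normal((3, 2, 5))

# tensordot contracts axis 2 of a with axis 1 of b but keeps BOTH batch axes.
full = np.tensordot(a, b, axes=([2], [1]))     # shape (3, 5, 3, 5)

# Repeating the index i in the input picks the entries where the batch indices agree.
batched = np.einsum('ijil->ijl', full)         # shape (3, 5, 5)

print(full.shape, batched.shape, np.allclose(batched, a @ b))
```

This computes every pairing of batches and then discards most of them, so a @ b or the 'ijk,ikl->ijl' einsum is the more direct choice.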
Can someone explain in steps how numpy broadcasting works in this case?
a = np.ones((2,3))
b = np.ones((2,1,3))
c = a-b
a.shape
(2, 3)
b.shape
(2, 1, 3)
c.shape
(2, 2, 3)
Referring to this page, it says that numpy prepends the tensor with lower rank with 1s, so in this case we have
a.shape = [1,2,3]
Tile a along axis 1 to get a.shape=[2,2,3]
tile b along axis 2 to get b.shape=[2,2,3]
When the dimensions are same, subtract
Prepend 1 to a.shape, so a.shape -> (1,2,3)
Stretch a along axis 0 (the first dimension) to match b, so a.shape -> (2,2,3)
Stretch b along axis 1 (the second dimension) to match a, so b.shape -> (2,2,3)
Subtract
Is that what you're looking for?
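The steps above can be reproduced by hand as a rough sketch. The arrays below use distinct values instead of np.ones (an assumption made here so the comparison is meaningful), and np.broadcast_to stands in for the "stretching", which NumPy actually performs without copying data.

```python
import numpy as np

a = np.arange(6).reshape(2, 3).astype(float)            # stand-in for np.ones((2, 3))
b = 10 * np.arange(6).reshape(2, 1, 3).astype(float)    # stand-in for np.ones((2, 1, 3))

a3 = a[None, :, :]                        # step 1: prepend an axis -> (1, 2, 3)
a_full = np.broadcast_to(a3, (2, 2, 3))   # step 2: stretch axis 0 of a
b_full = np.broadcast_to(b, (2, 2, 3))    # step 3: stretch axis 1 of b
c_manual = a_full - b_full                # step 4: elementwise subtraction

print(c_manual.shape, np.array_equal(c_manual, a - b))
```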
I have a one-dimensional array of scalar values
Y = np.array([1, 2])
I also have a 3-dimensional array:
X = np.random.randint(0, 255, size=(2, 2, 3))
I am attempting to subtract each value of Y from X, so I should get back Z, which should be of shape (2, 2, 2, 3).
I can't seem to figure out how to do this via broadcasting.
I tried changing the shape of Y:
Y = np.array([[[1, 2]]])
but not sure what the correct shape should be.
Broadcasting lines up dimensions on the right. So you're looking to operate on a (2, 1, 1, 1) array and a (2, 2, 3) array.
The simplest way I can think of is using reshape:
Y = Y.reshape(-1, 1, 1, 1)
More generally:
Y = Y.reshape(-1, *([1] * X.ndim))
At most one of the arguments to reshape can be -1, indicating all the remaining size not accounted for by other dimensions.
To get Z of shape (2, 2, 2, 3):
Z = X - Y.reshape(-1, *([1] * X.ndim))
If you were OK with having Z of shape (2, 2, 3, 2), the operation would be much simpler:
Z = X[..., None] - Y
None or np.newaxis will insert a unit axis at the end of X's shape, making it broadcast properly with the 1D Y.
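A short sketch comparing the two layouts may help when choosing between them. The names Z_lead and Z_trail are introduced here for illustration; both hold the same numbers, only with the Y axis in a different position.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.integers(0, 255, size=(2, 2, 3))
Y = np.array([1, 2])

Z_lead = X - Y.reshape(-1, *([1] * X.ndim))   # (2, 2, 2, 3): Z_lead[i] == X - Y[i]
Z_trail = X[..., None] - Y                    # (2, 2, 3, 2): Z_trail[..., i] == X - Y[i]

# Moving the last axis of Z_trail to the front recovers Z_lead.
print(Z_lead.shape, Z_trail.shape)
print(np.array_equal(np.moveaxis(Z_trail, -1, 0), Z_lead))
```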
I am not entirely sure on which dimension you want your subtraction to take place, but X - Y will not return an error if you define Y such as Y = numpy.array([1,2]).reshape(2, 1, 1) or Y = numpy.array([1,2]).reshape(1, 2, 1).
My question is about Python array shape.
What is the difference between array size (2, ) and (2, 1)?
I tried to add those two arrays together. However, I got an error as follows:
Non-broadcastable output operand with shape (2, ) doesn't match the broadcast shape (2, 2)
There is no difference in the raw memory. But logically, one is a one-dimensional array of two values, the other is a 2D array (where one of the dimensions just happens to be size 1).
The logical distinction is important to numpy; when you try to add them, it wants to make a new 2x2 array where the top row is the sum of the (2, 1) array's top "row" with each value in the (2,) array. If you use += to do that though, you're indicating that you expect to be able to modify the (2,) array in place, which is not possible without resizing (which numpy won't do). If you change your code from:
arr1 += arr2
to:
arr1 = arr1 + arr2
it will happily create a new (2, 2) array. Or if the goal was that the 2x1 array should act like a flat 1D array, you can flatten it:
alreadyflatarray += twodarray.flatten()
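The difference between the in-place and the out-of-place addition can be seen in a small sketch. The values in arr1 and arr2 are made up for illustration; only the shapes (2,) and (2, 1) come from the question.

```python
import numpy as np

arr1 = np.array([1, 2])          # shape (2,)
arr2 = np.array([[10], [20]])    # shape (2, 1)

try:
    arr1 += arr2                 # the result would be (2, 2) and cannot be stored in arr1
    in_place_ok = True
except ValueError:
    in_place_ok = False

total = arr1 + arr2              # a new array of shape (2, 2)
arr1 += arr2.flatten()           # (2,) += (2,) works in place

print(in_place_ok, total.shape, arr1)
```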
(2,) is a one-dimensional array, while (2,1) is a matrix with only one column.
You can easily see the difference by creating arrays full of zeros using np.zeros, passing the desired shape:
>>> np.zeros((2,))
array([0., 0.])
>>> np.zeros((2,1))
array([[0.],
       [0.]])
@yx131, have a look at the code below to get a clear picture of tuples and their use in defining the shape of NumPy arrays.
Note: The code further below also explains the problems related to broadcasting in NumPy.
Also check numpy's broadcasting rule at
https://docs.scipy.org/doc/numpy/user/basics.broadcasting.html.
There's a difference between (2) and (2,). The first one is the literal value 2, whereas the second one is a tuple.
(2,) is a 1-item tuple and (2, 2) is a 2-item tuple. This is made clear in the code example.
Note: For NumPy arrays, (2,) denotes the shape of a 1-dimensional array of 2 items, and (2, 2) denotes the shape of a 2-dimensional array (matrix) with 2 rows and 2 columns. If you want to add 2 arrays, their shapes must be the same or compatible under the broadcasting rules.
v = (2) # Assignment of value 2
t = (2,) # Comma is necessary at the end of item to define 1 item tuple, it is not required in case of list
t2 = (2, 1) # 2 item tuple
t3 = (3, 4) # 2 item tuple
print(v, type(v))
print(t, type(t))
print(t2, type(t2))
print(t3, type(t3))
print(t + t2)
print(t2 + t3)
"""
2 <class 'int'>
(2,) <class 'tuple'>
(2, 1) <class 'tuple'>
(3, 4) <class 'tuple'>
(2, 2, 1)
(2, 1, 3, 4)
"""
Now, let's have a look at the below code to figure out the error related to broadcasting. It's all related to dimensions.
# Python 3.5.2
import numpy as np
arr1 = np.array([1, 4])
arr2 = np.array([7, 6, 3, 8])
arr3 = np.array([3, 6, 2, 1])
print(arr1, ':', arr1.shape)
print(arr2, ":", arr2.shape)
print(arr3, ":", arr3.shape)
print("\n")
"""
[1 4] : (2,)
[7 6 3 8] : (4,)
[3 6 2 1] : (4,)
"""
# Changing shapes (dimensions)
arr1.shape = (2, 1)
arr2.shape = (2, 2)
arr3.shape = (2, 2)
print(arr1, ':', arr1.shape)
print(arr2, ":", arr2.shape)
print(arr3, ":", arr3.shape)
print("\n")
print(arr1 + arr2)
"""
[[1]
 [4]] : (2, 1)
[[7 6]
 [3 8]] : (2, 2)
[[3 6]
 [2 1]] : (2, 2)
[[ 8  7]
 [ 7 12]]
"""
arr1.shape = (2, )
print(arr1, arr1.shape)
print(arr1 + arr2)
"""
[1 4] (2,)
[[ 8 10]
 [ 4 12]]
"""
# Code with error(Broadcasting related)
arr2.shape = (4,)
print(arr1+arr2)
"""
Traceback (most recent call last):
File "source_file.py", line 53, in <module>
print(arr1+arr2)
ValueError: operands could not be broadcast together
with shapes (2,) (4,)
"""
So in your case, the problem is that the dimensions of the arrays being added do not match (according to NumPy's broadcasting rules). Thanks.
Make an array that has shape (2,)
In [164]: a = np.array([3,6])
In [165]: a
Out[165]: array([3, 6])
In [166]: a.shape
Out[166]: (2,)
In [167]: a.reshape(2,1)
Out[167]:
array([[3],
       [6]])
In [168]: a.reshape(1,2)
Out[168]: array([[3, 6]])
The first displays like a simple list [3,6]. The second as a list with 2 nested lists. The third as a list with one nested list of 2 items. So there is a consistent relation between shape and list nesting.
In [169]: a + a
Out[169]: array([ 6, 12]) # shape (2,)
In [170]: a + a.reshape(1,2)
Out[170]: array([[ 6, 12]]) # shape (1,2)
In [171]: a + a.reshape(2,1)
Out[171]:
array([[ 6,  9], # shape (2,2)
       [ 9, 12]])
Dimensions behave as:
(2,) + (2,) => (2,)
(2,) + (1,2) => (1,2) + (1,2) => (1,2)
(2,) + (2,1) => (1,2) + (2,1) => (2,2) + (2,2) => (2,2)
That is, a lower-dimensional array can be expanded to the matching number of dimensions by adding leading size-1 dimensions.
And size 1 dimensions can be changed to match the corresponding dimension.
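The two rules above can be written out as a small function. This is an illustrative sketch, not NumPy's actual implementation; broadcast_shape is a hypothetical helper name introduced here, and its result is compared against what np.broadcast reports.

```python
import numpy as np

def broadcast_shape(shape_a, shape_b):
    """Hypothetical helper: apply the two rules by hand to a pair of shapes."""
    ndim = max(len(shape_a), len(shape_b))
    # Rule 1: pad the shorter shape with leading 1s.
    pa = (1,) * (ndim - len(shape_a)) + tuple(shape_a)
    pb = (1,) * (ndim - len(shape_b)) + tuple(shape_b)
    out = []
    for x, y in zip(pa, pb):
        # Rule 2: sizes must match, or one of them must be 1.
        if x != y and x != 1 and y != 1:
            raise ValueError(f"cannot broadcast {shape_a} with {shape_b}")
        out.append(y if x == 1 else x)
    return tuple(out)

for sa, sb in [((2,), (2,)), ((2,), (1, 2)), ((2,), (2, 1))]:
    expected = np.broadcast(np.empty(sa), np.empty(sb)).shape
    print(sa, sb, broadcast_shape(sa, sb), expected)
```

An in-place a += ... additionally requires that this result shape equals a's own shape, which is where the errors further down come from.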
I suspect you got the error when doing a += ... (if so, you should have stated that clearly).
In [172]: a += a
In [173]: a += a.reshape(1,2)
....
ValueError: non-broadcastable output operand with shape (2,)
doesn't match the broadcast shape (1,2)
In [175]: a += a.reshape(2,1)
...
ValueError: non-broadcastable output operand with shape (2,)
doesn't match the broadcast shape (2,2)
With the a+=... addition, the result shape is fixed at (2,), the shape of a. But as noted above the two additions generate (1,2) and (2,2) results, which aren't compatible with (2,).
The same reasoning can explain these additions and errors:
In [176]: a1 = a.reshape(1,2)
In [177]: a1 += a
In [178]: a1
Out[178]: array([[12, 24]])
In [179]: a2 = a.reshape(2,1)
In [180]: a2 += a
...
ValueError: non-broadcastable output operand with shape (2,1)
doesn't match the broadcast shape (2,2)
In [182]: a1 += a2
...
ValueError: non-broadcastable output operand with shape (1,2)
doesn't match the broadcast shape (2,2)
In numpy.sum() there is parameter called keepdims. What does it do?
As you can see here in the documentation:
http://docs.scipy.org/doc/numpy/reference/generated/numpy.sum.html
numpy.sum(a, axis=None, dtype=None, out=None, keepdims=False)
Sum of array elements over a given axis.
Parameters:
...
keepdims : bool, optional
If this is set to True, the axes which are reduced are left in the result as
dimensions with size one. With this option, the result will broadcast
correctly against the input array.
...
@Ney
@hpaulj is correct, you need to experiment, but I suspect you don't realize that summation for some arrays can occur along axes. Observe the following while reading the documentation:
>>> a
array([[0, 0, 0],
       [0, 1, 0],
       [0, 2, 0],
       [1, 0, 0],
       [1, 1, 0]])
>>> np.sum(a, keepdims=True)
array([[6]])
>>> np.sum(a, keepdims=False)
6
>>> np.sum(a, axis=1, keepdims=True)
array([[0],
       [1],
       [2],
       [1],
       [2]])
>>> np.sum(a, axis=1, keepdims=False)
array([0, 1, 2, 1, 2])
>>> np.sum(a, axis=0, keepdims=True)
array([[2, 4, 0]])
>>> np.sum(a, axis=0, keepdims=False)
array([2, 4, 0])
You will notice that if you don't specify an axis (first two examples), the numerical result is the same, but keepdims=True returns a 2D array containing the number 6, whereas keepdims=False returns a scalar.
Similarly, when summing along axis 1 (across each row), a 2D array is returned again when keepdims=True.
The last example, along axis 0 (down the columns), shows the same characteristic: dimensions are kept when keepdims=True.
Studying axes and their properties is critical to a full understanding of the power of NumPy when dealing with multidimensional data.
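The documentation's remark that the result "will broadcast correctly against the input array" is the practical reason to use keepdims. A minimal sketch, using a made-up 5x3 array rather than the one above (which contains an all-zero row), divides each row by its own sum:

```python
import numpy as np

a = np.arange(15).reshape(5, 3).astype(float)   # hypothetical 5x3 data

row_sums = a.sum(axis=1, keepdims=True)         # shape (5, 1)
fractions = a / row_sums                        # (5, 3) / (5, 1) broadcasts row by row

# Without keepdims the sums have shape (5,), which lines up with the last
# axis (length 3) of a and therefore fails to broadcast.
try:
    a / a.sum(axis=1)
    failed = False
except ValueError:
    failed = True

print(row_sums.shape, fractions.sum(axis=1), failed)
```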
Here is an example showing keepdims in action when working with higher-dimensional arrays. Let's see how the shape of the array changes as we do different reductions:
import numpy as np
a = np.random.rand(2,3,4)
a.shape
# => (2, 3, 4)
# Note: axis=0 refers to the first dimension of size 2
# axis=1 refers to the second dimension of size 3
# axis=2 refers to the third dimension of size 4
a.sum(axis=0).shape
# => (3, 4)
# Simple sum over the first dimension, we "lose" that dimension
# because we did an aggregation (sum) over it
a.sum(axis=0, keepdims=True).shape
# => (1, 3, 4)
# Same sum over the first dimension, but instead of "losing" that
# dimension, it becomes 1.
a.sum(axis=(0,2)).shape
# => (3,)
# Here we "lose" two dimensions
a.sum(axis=(0,2), keepdims=True).shape
# => (1, 3, 1)
# Here the two reduced dimensions each become 1