Element wise divide like MATLAB's ./ operator? - python

I am trying to normalize some Nx3 data. If X is an Nx3 array and D is an Nx1 array, in MATLAB, I can do
Y = X./D
If I do the following in Python, I get an error
X = np.random.randn(100,3)
D = np.linalg.norm(X,axis=1)
Y = X/D
ValueError: operands could not be broadcast together with shapes (100,3) (100,)
Any suggestions?
Edit: Thanks to dm2.
Y = X/D.reshape((100,1))
Another way is to use scikit-learn.
from sklearn import preprocessing
Y = preprocessing.normalize(X)

From numpy documentation on array broadcasting:
When operating on two arrays, NumPy compares their shapes
element-wise. It starts with the trailing (i.e. rightmost) dimensions
and works its way left. Two dimensions are compatible when
they are equal, or
one of them is 1
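Both of your arrays have the same first dimension, but your X array is 2-dimensional with shape (100,3), while your D array is 1-dimensional with shape (100,). Because shapes are compared starting from the rightmost dimension, the 100 in D is lined up against the 3 in X; these are neither equal nor 1, which means the two shapes do not meet the requirements to be broadcast together.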
To make sure they do, you could reshape your D array into a 2-dimensional array of shape (100,1), which would satisfy the requirements to broadcast: rightmost dimensions are 3 and 1 (one of them is 1) and the other dimensions are equal (100 and 100).
So:
Y = X/D.reshape((-1,1))
or
Y = X/D.reshape((100,1))
or
Y = X/D[:,np.newaxis]
Should give you the result you're after.
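As a variation on the reshape approaches above, np.linalg.norm accepts keepdims=True, which leaves the reduced axis in place with size 1 so that D already has shape (100,1). This is only a minimal sketch; the seeded generator and the row_norms check are additions for illustration and are not part of the original question.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded only so the sketch is repeatable
X = rng.standard_normal((100, 3))

# keepdims=True keeps the reduced axis, so D has shape (100, 1) instead of (100,)
D = np.linalg.norm(X, axis=1, keepdims=True)
Y = X / D  # (100, 3) / (100, 1) broadcasts across the three columns

# Each row of Y should now have unit length
row_norms = np.linalg.norm(Y, axis=1)
print(D.shape, Y.shape, np.allclose(row_norms, 1.0))
```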

Related

Numpy docs: How to multiply 2 arrays of different sizes together?

The NumPy docs claim you can multiply arrays of different lengths together, but it is not working for me. I'm definitely misinterpreting what it's saying, but there's no example to go with their text. From the docs here:
Therefore, I created some code to try it out but I'm getting an error that says ValueError: operands could not be broadcast together with shapes (4,1) (3,1). Same error if I try this with shapes (4,) and (3,).
a = np.array([[1.0],
[1.0],
[1.0],
[1.0]])
print(a.shape)
b = np.array([[2.0],
[2.0],
[2.0]])
print(b.shape)
a*b
You can multiply arrays together if, along every axis, the two lengths are equal or one of the arrays has length 1.
In your example the arrays have shapes 4x1 and 3x1, so if you want to multiply them together you need to transpose one of them:
a = np.array([[1.0],
[1.0],
[1.0],
[1.0]])
print(a.shape)
b = np.array([[2.0],
[2.0],
[2.0]])
print(b.shape)
a*b.T
The shapes are now 4x1 and 1x3: along each axis one of the arrays has length 1, so they can be broadcast, and the result will have shape 4x3.
Copying and pasting the immediately previous text, in the same document, with my own emphasis:
When operating on two arrays, NumPy compares their shapes element-wise. It starts with the trailing (i.e. rightmost) dimensions and works its way left. Two dimensions are compatible when
they are equal, or
one of them is 1
If these conditions are not met, a ValueError: operands could not be broadcast together exception is thrown, indicating that the arrays have incompatible shapes. The size of the resulting array is the size that is not 1 along each axis of the inputs.
Arrays do not need to have the same number of dimensions. For example, if you have a 256x256x3 array of RGB values, and you want to scale each color in the image by a different value, you can multiply the image by a one-dimensional array with 3 values. Lining up the sizes of the trailing axes of these arrays according to the broadcast rules, shows that they are compatible:
Image (3d array): 256 x 256 x 3
Scale (1d array): 3
Result (3d array): 256 x 256 x 3
When either of the dimensions compared is one, the other is used. In other words, dimensions with size 1 are stretched or “copied” to match the other.
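The image example from the quoted documentation can be tried directly. The sketch below uses an array of ones as a stand-in for a real image, and the three scale factors are hypothetical values chosen only to make the effect visible.

```python
import numpy as np

image = np.ones((256, 256, 3))      # stand-in for an RGB image
scale = np.array([0.5, 1.0, 2.0])   # hypothetical per-channel factors

# Trailing axes line up as 3 vs 3; the missing leading axes of scale are treated as 1
scaled = image * scale
print(scaled.shape)
print(scaled[0, 0])                 # one pixel, each channel scaled separately
```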
Now, let's try applying this logic to the example data.
A (2d array): 4 x 1
B (2d array): 3 x 1
Look at the first dimension: the lengths are 4 and 3. Is 4 equal to 3? No. Is either of those equal to 1? No. Therefore, the conditions are not met. We cannot broadcast along the first dimension of the array because there is not a rule that tells us how to match up 4 values against 3. If it were 4 values against 4, or 3 against 3, we could pair them up directly. If it were 4 against 1, or 1 against 3, we could "broadcast" by repeating the single value. Neither case applies here.
We could, however, multiply if either of the arrays were transposed:
A.T (2d array): 1 x 4
B (2d array): 3 x 1
A (2d array): 4 x 1
B.T (2d array): 1 x 3
Verifying this is left as an exercise for the reader.
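For readers who would rather check than take it on faith, here is one possible way to do that exercise. The variable names are chosen for this sketch only; the arrays mirror the 4x1 and 3x1 shapes from the question.

```python
import numpy as np

a = np.ones((4, 1))
b = np.full((3, 1), 2.0)

# The original product fails: 4 vs 3 on the first axis, and neither is 1
try:
    a * b
    failed = False
except ValueError:
    failed = True

# Transposing either operand makes every axis pair "equal or 1"
shape_at_b = (a.T * b).shape   # (1, 4) with (3, 1)
shape_a_bt = (a * b.T).shape   # (4, 1) with (1, 3)
print(failed, shape_at_b, shape_a_bt)
```

Note that the two transposed variants give results of different shapes, so which one to use depends on the layout you want.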

nested operations on two numpy arrays, one 2d and one 1d

Say I have one 2d numpy array X with shape (3,3) and one numpy array Y with shape (3,) where
X = np.array([[0,1,2],
[3,4,5],
[1,9,2]])
Y = np.array([1,0,1])
How can I create a numpy array, Z for example, from multiplying X,Y element-wise and then summation row-wise?
multiplying element-wise would yield: 0,0,2, 3,0,5, 1,0,2
then, adding each row would yield:
Z = np.array([2,8,3])
I have tried variations of
Z = np.sum(X * Y) --> adds all elements of entire array, not row-wise.
I know I can use a for loop but the dataset is very large and so I am trying to find a more efficient numpy-specific way to perform the operation. Is this possible?
You can do the following:
sum_row = np.sum(X*Y, axis=1) # axis=0 for columnwise
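Multiplying element-wise and then summing each row is the same as a matrix-vector product, so the @ operator or np.einsum should give the same result without building the intermediate X*Y array. The sketch below assumes Y is 1-dimensional with shape (3,), as described in the question text.

```python
import numpy as np

X = np.array([[0, 1, 2],
              [3, 4, 5],
              [1, 9, 2]])
Y = np.array([1, 0, 1])   # assumed 1-D, shape (3,)

z_sum = np.sum(X * Y, axis=1)           # multiply, then sum along each row
z_matmul = X @ Y                        # matrix-vector product
z_einsum = np.einsum('ij,j->i', X, Y)   # same contraction written explicitly

print(z_sum, z_matmul, z_einsum)
```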

Dynamically broadcast a numpy array

I currently have a 1D numpy array, epsilons, that needs to perform element-wise multiplication on array x. However, the dimensionality of x is dynamic and changes with each iteration of the following for loop:
for x in grads:
    x = x * epsilons
    print(x)
epsilons always has the shape (M,). However, for the first iteration, x takes the shape (M,4,2) while it takes the shape (M,4) for the second iteration (the shape of x changes as the code iterates over grads). Is there a way I can automatically broadcast epsilons to the shape of x so that I can perform this element-wise multiplication for any shape of x?
You can just reshape epsilons to a compatible shape. NumPy then broadcasts it automatically (much as a broadcast_to call would): here the reshaped epsilons needs as many dimensions as x, with size M along the first dimension and size 1 along each of the others.
Thanks to @hpaulj for the improved solution.
# Reshape epsilons so that its values lie along the first dimension (the least contiguous one)
reshapedEpsilons = epsilons.reshape((M,)+(1,)*(x.ndim-1))
# The values are broadcast automatically along the other dimensions, so the result has the same shape as x
# Actual element-wise multiplication
x *= reshapedEpsilons
PS: note that a = a * b creates a new array and is less efficient than a *= b, which modifies the values in place.
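To see the reshape working for both shapes mentioned in the question, here is a small self-contained sketch. M = 5, the contents of grads, and the scaled list are assumptions made up for the illustration; only the shapes (M,4,2) and (M,4) come from the question.

```python
import numpy as np

M = 5
epsilons = np.arange(1.0, M + 1)                 # shape (M,)
grads = [np.ones((M, 4, 2)), np.ones((M, 4))]    # hypothetical data, shapes from the question

scaled = []
for x in grads:
    # Add one size-1 axis per extra dimension of x: (M,) -> (M, 1, 1) or (M, 1)
    eps = epsilons.reshape((-1,) + (1,) * (x.ndim - 1))
    scaled.append(x * eps)

print([s.shape for s in scaled])
```

Using -1 for the first axis avoids having to pass M around, at the cost of being slightly less explicit.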

Multiplying 3D matrix with 2D matrix

I have two matrices to multiply. One is the weight matrix W, whose size is 900x2x2. Another is input matrix I, whose size is 2x2.
I want to perform a summation over c = WI which will be a 900x1 matrix, but when I perform the operation it multiplies them and gives me a 900x2x2 matrix again.
Question #2 (related): So I made both of them 2D and multiplied 900x4 * 4x1, but that gives me an error saying:
ValueError: operands could not be broadcast together with shapes (900,4) (4,1)
It seems you are trying to sum-reduce the last two axes of the first (weight) array against the only two axes of the second array. We could translate that idea into NumPy code with np.tensordot, assuming arr1 and arr2 are the two input arrays respectively, like so -
np.tensordot(arr1,arr2,axes=([1,2],[0,1]))
Another, simpler way to put it into NumPy code would be with np.einsum, like so -
np.einsum('ijk,jk',arr1,arr2)
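The sketch below checks that both calls agree with the plain "multiply, then sum over the last two axes" reading of the question, using random data as a stand-in for the real W and I. It also touches on the second question: with 2-D arrays, * is still element-wise, so the flattened 900x4 by 4x1 product needs @ (or np.dot) instead. The c_* names are introduced here for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((900, 2, 2))   # stand-in for the weight array
I = rng.standard_normal((2, 2))        # stand-in for the input array

c_tensordot = np.tensordot(W, I, axes=([1, 2], [0, 1]))
c_einsum = np.einsum('ijk,jk', W, I)
c_bcast = (W * I).sum(axis=(1, 2))     # broadcast multiply, then sum the last two axes

# Flattened 2-D version: matrix product with @, not element-wise *
c_matmul = W.reshape(900, 4) @ I.reshape(4, 1)

print(c_tensordot.shape, c_matmul.shape)
```

The first three results are 1-dimensional with 900 entries; indexing with [:, None] would turn any of them into the 900x1 column the question asks for.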

How to assign a 1D numpy array to 2D numpy array?

Consider the following simple example:
X = numpy.zeros([10, 4]) # 2D array
x = numpy.arange(0,10) # 1D array
X[:,0] = x # WORKS
X[:,0:1] = x # returns ERROR:
# ValueError: could not broadcast input array from shape (10) into shape (10,1)
X[:,0:1] = (x.reshape(-1, 1)) # WORKS
Can someone explain why numpy has vectors of shape (N,) rather than (N,1) ?
What is the best way to do the casting from 1D array into 2D array?
Why do I need this?
Because I have a code which inserts result x into a 2D array X and the size of x changes from time to time so I have X[:, idx1:idx2] = x which works if x is 2D too but not if x is 1D.
Do you really need to be able to handle both 1D and 2D inputs with the same function? If you know the input is going to be 1D, use
X[:, i] = x
If you know the input is going to be 2D, use
X[:, start:end] = x
If you don't know the input dimensions, I recommend switching between one line or the other with an if, though there might be some indexing trick I'm not aware of that would handle both identically.
Your x has shape (N,) rather than shape (N, 1) (or (1, N)) because numpy isn't built for just matrix math. ndarrays are n-dimensional; they support efficient, consistent vectorized operations for any non-negative number of dimensions (including 0). While this may occasionally make matrix operations a bit less concise (especially in the case of dot for matrix multiplication), it produces more generally applicable code for when your data is naturally 1-dimensional or 3-, 4-, or n-dimensional.
I think you have the answer already included in your question. Numpy allows arrays to be of any dimensionality (while afaik Matlab prefers two dimensions where possible), so you need to be careful with this (and always distinguish between (n,) and (n,1)). By giving one number as one of the indices (like 0 in 3rd row), you reduce the dimensionality by one. By giving a range as one of the indices (like 0:1 in 4th row), you don't reduce the dimensionality.
Line 3 makes perfect sense for me and I would assign to the 2-D array this way.
Here are two tricks that make the code a little shorter.
X = numpy.zeros([10, 4]) # 2D array
x = numpy.arange(0,10) # 1D array
X.T[:1, :] = x
X[:, 2:3] = x[:, None]
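If the same assignment really has to accept both 1D and 2D inputs, one option is to reshape the input to two dimensions before assigning. The helper below is a hypothetical sketch (assign_columns is not a NumPy function), and it assumes the first dimension of x always matches the number of rows of X.

```python
import numpy as np

def assign_columns(X, start, end, x):
    """Write x into X[:, start:end], accepting x of shape (N,) or (N, k)."""
    x = np.asarray(x)
    # reshape(N, -1) turns (N,) into (N, 1) and leaves (N, k) unchanged
    X[:, start:end] = x.reshape(X.shape[0], -1)

X = np.zeros((10, 4))
assign_columns(X, 0, 1, np.arange(10))       # 1-D input goes into one column
assign_columns(X, 1, 3, np.ones((10, 2)))    # 2-D input goes into two columns
print(X[:3])
```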
