If I have a numpy array X with X.shape=(m,n) and a second column vector y with y.shape=(m,1), how can I calculate the covariance of each column of X with y without using a for loop? I expect the result to be of shape (m,1) or (1,m).
Assuming that the output is meant to be of shape (1,n), i.e. one covariance scalar for each column of A (your X) paired with B (your y), giving n scalars for n columns, here are two approaches that apply the covariance formula directly.
Approach #1: With Broadcasting
np.sum((A - A.mean(0))*(B - B.mean(0)),0)/B.size
Approach #2: With Matrix-multiplication
np.dot((B - B.mean(0)).T,(A - A.mean(0)))/B.size
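As a quick sanity check, the sketch below runs both approaches on random data and compares them with a per-column np.cov reference. The array sizes and the names cov_bcast, cov_dot and cov_ref are assumptions made for illustration. Note that dividing by B.size gives the population covariance, so the reference uses bias=True; np.cov divides by m-1 by default.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 50, 4
A = rng.normal(size=(m, n))   # plays the role of X
B = rng.normal(size=(m, 1))   # plays the role of y

# Approach #1: broadcasting, result has shape (n,)
cov_bcast = np.sum((A - A.mean(0)) * (B - B.mean(0)), 0) / B.size

# Approach #2: matrix multiplication, result has shape (1, n)
cov_dot = np.dot((B - B.mean(0)).T, (A - A.mean(0))) / B.size

# Loop-based reference; bias=True divides by m, matching the formulas above
cov_ref = np.array([np.cov(A[:, j], B[:, 0], bias=True)[0, 1] for j in range(n)])

print(cov_bcast.shape, cov_dot.shape)
print(np.allclose(cov_bcast, cov_ref), np.allclose(cov_dot[0], cov_ref))
```

If the sample covariance is wanted instead, dividing by (B.size - 1) should match the np.cov default.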
As we know, in linear algebra, multiplying a vector by a matrix or multiplying two matrices requires that the number of columns of the first operand equal the number of rows of the second.
While I was working with numpy in Python, it gave me a different result.
Here is my code, and it works.
np.array([1,2]) * np.array([[1],[2],[3]])
So is there any difference between numpy vector-to-matrix multiplication and linear algebra vector-to-matrix multiplication?
Use np.dot(a,b).
Use the following code and you will get the error you expect.
np.dot(np.array([1,2]) , np.array([[1],[2],[3]]))
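Because *, +, -, / work element-wise on arrays. With broadcasting, the shapes (2,) and (3,1) combine into a (3,2) result instead of raising an error.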
From the np.dot documentation: if either a or b is 0-D (scalar), it is equivalent to multiply, and using numpy.multiply(a, b) or a * b is preferred.
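To make the difference concrete, here is a small sketch using the two arrays from the question. The names elementwise and dot_error are introduced only for this illustration.

```python
import numpy as np

a = np.array([1, 2])            # shape (2,)
b = np.array([[1], [2], [3]])   # shape (3, 1)

# Element-wise product: (2,) and (3, 1) broadcast to (3, 2)
elementwise = a * b
print(elementwise.shape)

# Matrix product: inner dimensions 2 and 3 do not match, so np.dot raises
try:
    np.dot(a, b)
    dot_error = None
except ValueError as exc:
    dot_error = exc
print(type(dot_error).__name__)
```

With compatible shapes, for example np.dot(np.array([1,2,3]), b), the matrix product goes through as linear algebra would predict.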
Suppose that we are given a two dimensional matrix A of dtype=uint8 with N rows and M columns and a uint8 vector of size N called x. We need to bit-wise XOR each row of A, e.g. A[i], with the corresponding element in x, i.e. x[i].
Currently, I am doing this as follows, but think that there are more efficient ways of doing that with numpy vectorization capabilities.
for i in range(A.shape[0]):
    A[i,:] = np.bitwise_xor(A[i,:], x[i])
This is the row-wise XOR. Besides this, the XOR needs to be applied column-wise, too.
Thanks in advance.
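One possible vectorized sketch relies on broadcasting: giving x a trailing axis pairs x[i] with every element of row i. The column-wise case needs a vector of length M, so the name y below is a hypothetical second vector that does not appear in the question, and the sizes are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
N, M = 4, 6
A = rng.integers(0, 256, size=(N, M), dtype=np.uint8)
x = rng.integers(0, 256, size=N, dtype=np.uint8)   # one value per row
y = rng.integers(0, 256, size=M, dtype=np.uint8)   # hypothetical: one value per column

# Row-wise: x[:, None] has shape (N, 1) and broadcasts across the columns
row_xor = A ^ x[:, None]

# Column-wise: y has shape (M,) and broadcasts across the rows
col_xor = A ^ y

# Loop version from the question, kept as a reference
expected = A.copy()
for i in range(expected.shape[0]):
    expected[i, :] = np.bitwise_xor(expected[i, :], x[i])

print(np.array_equal(row_xor, expected), row_xor.dtype)
```

To update A in place as the loop does, A ^= x[:, None] or np.bitwise_xor(A, x[:, None], out=A) should avoid allocating a new array.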
Let X be an M x N matrix. Denote by xi the i-th column of X. I want to create a 3-dimensional N x M x M array consisting of the M x M matrices xi.dot(xi.T).
How can I do it most elegantly with numpy? Is it possible to do this using only matrix operations, without loops?
One approach with broadcasting -
X.T[:,:,None]*X.T[:,None]
Another with broadcasting and swapping axes afterwards -
(X[:,None,:]*X).swapaxes(0,2)
Another with broadcasting and a multi-dimensional transpose afterwards -
(X[:,None,:]*X).T
Another approach with np.einsum, which might be more intuitive thinking in terms of the iterators involved if you are translating from a loopy code -
np.einsum('ij,kj->jik',X,X)
The basic idea in all of these approaches, described in terms of X.T, is to keep the first axis aligned while spreading the last axis out against itself for elementwise multiplication. We achieve this by extending the input to two 3D versions that broadcast against each other.
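The sketch below checks the four expressions above against an explicit loop of outer products. The matrix sizes and the names loop and candidates are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
M, N = 3, 5
X = rng.normal(size=(M, N))

# Reference: one M x M outer product per column of X, stacked to (N, M, M)
loop = np.stack([np.outer(X[:, i], X[:, i]) for i in range(N)])

candidates = {
    "broadcast on X.T": X.T[:, :, None] * X.T[:, None],
    "swapaxes": (X[:, None, :] * X).swapaxes(0, 2),
    "transpose": (X[:, None, :] * X).T,
    "einsum": np.einsum('ij,kj->jik', X, X),
}

for name, result in candidates.items():
    print(name, result.shape, np.allclose(result, loop))
```

The swapaxes and transpose variants give the same values here because each M x M block is symmetric, so exchanging its two axes changes nothing.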
I have two numpy arrays. 'A' of size w,h,2 and 'B' with n,2.
In other words, A is a 2-dimensional array of 2D vectors while B is a 1D array of 2D vectors.
What I want as a result is an array of size w,h,n. The last dimension is an n-dimensional vector where each component is the Euclidean distance between the corresponding vector from A (indexed by the first two dimensions w and h) and the nth vector of B.
I know that I can just loop through w, h and n in Python manually and calculate the distance for each element, but I would like to know if there is a smart way to do that with numpy operations to increase performance.
I found some similar questions but unfortunately all of those use input arrays of the same dimensionality.
Approach #1
You could reshape A to 2D, use SciPy's cdist, which expects 2D arrays as inputs, to get the Euclidean distances, and finally reshape back to 3D.
Thus, an implementation would be -
from scipy.spatial.distance import cdist
w, h = A.shape[:2]  # grid dimensions of A
out = cdist(A.reshape(-1,2),B).reshape(w,h,-1)
Approach #2
Since the axis of reduction has length 2 only, we can just slice the input arrays to save memory on intermediate arrays, like so -
np.sqrt((A[...,0,None] - B[:,0])**2 + (A[...,1,None] - B[:,1])**2)
Explanation on A[...,0,None] and A[...,1,None] :
With that None we are just introducing a new axis at the end of sliced A. Well, let's take a small example -
In [54]: A = np.random.randint(0,9,(4,5,2))
In [55]: A[...,0].shape
Out[55]: (4, 5)
In [56]: A[...,0,None].shape
Out[56]: (4, 5, 1)
In [57]: B = np.random.randint(0,9,(3,2))
In [58]: B[:,0].shape
Out[58]: (3,)
So, we have :
A[...,0,None] : 4 x 5 x 1
B[:,0] : 3
That is essentially :
A[...,0,None] : 4 x 5 x 1
B[:,0] : 1 x 1 x 3
When the subtraction is performed, the singleton dims are broadcast to match the dimensions of the other participating array -
A[...,0,None] - B[:,0] : 4 x 5 x 3
We repeat this for the second index along the last axis. We add these two arrays after squaring and finally take a square root to get the final Euclidean distances.
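As a check on the slicing approach, the sketch below compares it with a straightforward version that builds the full (w, h, n, 2) difference array and reduces it with np.linalg.norm. The sizes and the names sliced and full are assumptions for illustration, and only NumPy is used.

```python
import numpy as np

rng = np.random.default_rng(0)
w, h, n = 4, 5, 3
A = rng.random((w, h, 2))
B = rng.random((n, 2))

# Approach #2: slice each coordinate, so intermediates stay at (w, h, n)
sliced = np.sqrt((A[..., 0, None] - B[:, 0])**2 + (A[..., 1, None] - B[:, 1])**2)

# Reference: full broadcast to (w, h, n, 2), then norm over the last axis
full = np.linalg.norm(A[:, :, None, :] - B, axis=-1)

print(sliced.shape, np.allclose(sliced, full))
```

The reference version is shorter to write but allocates an intermediate array twice as large, which is the memory cost the slicing approach avoids.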
I would like to index a column vector in a matrix in Python/numpy and have it returned as a column vector and not a 1D array.
x = np.array([[1,2],[3,4]])
x[:,1]
>array([2, 4])
Using
np.transpose(x[:,1])
is not a solution. Following the numpy.transpose documentation, this will return a row vector (1-D array).
A few options -
x[:,[1]]
x[:,None,1]
x[:,1,None]
x[:,1][:,None]
x[:,1].reshape(-1,1)
x[None,:,1].T
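A short sketch confirming that each option above returns a 2 x 1 column for the example x. The name options is introduced for illustration, and the last entry, a length-one slice, is an extra variant that is not in the list above.

```python
import numpy as np

x = np.array([[1, 2], [3, 4]])

options = [
    x[:, [1]],
    x[:, None, 1],
    x[:, 1, None],
    x[:, 1][:, None],
    x[:, 1].reshape(-1, 1),
    x[None, :, 1].T,
    x[:, 1:2],            # extra variant: slicing keeps the axis
]

for col in options:
    print(col.shape, col.ravel())
```

One difference worth knowing: x[:,[1]] uses advanced indexing and returns a copy, while the slicing and None-based forms return views of x.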