I am confused by matrix operations in Python NumPy.
The dot and outer operations don't seem to behave the way I learned in my linear algebra class.
import numpy
n = numpy.arange(-5, 6)
w = numpy.arange(-20, 21)
n.shape
w.shape
outer = numpy.outer(w, n)
outer.shape
dot = numpy.dot(n, outer.transpose())
dot.shape
Here n is an (11, 1) matrix and w is a (41, 1) matrix, so I think the sizes of w and n don't match for the outer product: (41, 1) outer (11, 1).
The dot also seems strange to me: n is an (11, 1) matrix and outer.transpose() is an (11, 41) matrix, so I think those sizes don't match either.
According to the documentation http://docs.scipy.org/doc/numpy/reference/generated/numpy.outer.html , the outer product of two vectors A (1xn) and B (1xm) is a matrix M (nxm), and its transpose has dimension mxn. This is exactly what you are seeing.
The dot product of a vector and a matrix is likewise described in the documentation: http://docs.scipy.org/doc/numpy/reference/generated/numpy.dot.html#numpy.dot - it amounts to the matrix multiplication of the row vector (first argument) with the matrix (second argument).
When I print out the shapes of the various objects your code creates, I get:
n.shape: (11,)
w.shape: (41,)
outer.shape: (41, 11)
dot.shape: (41,)
Which is entirely consistent with the above. What is your confusion? What result is not what you were expecting?
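If it helps, here is a minimal sketch (my own smaller arrays, same rules) that reproduces those shapes:
import numpy as np

n = np.arange(-2, 3)        # shape (5,)  -- a 1-d array, not a (5, 1) matrix
w = np.arange(-3, 4)        # shape (7,)

outer = np.outer(w, n)      # shape (7, 5): outer[i, j] = w[i] * n[j]
dot = np.dot(n, outer.T)    # (5,) dot (5, 7) -> (7,)

print(n.shape, w.shape, outer.shape, dot.shape)   # (5,) (7,) (7, 5) (7,)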
Can anyone explain under what conditions one would need to expand along axis=0? Please see the example below, starting from a given numpy array:
a = np.array([1, 2, 3, 4, 5, 6])
# [1, 2, 3, 4, 5, 6]
a1 = np.expand_dims(a, axis=0)   # reshape along axis=0
# [[1, 2, 3, 4, 5, 6]]
The expansion is typically needed when we use a function that operates on an (m, n) array to process the special case where m = 1.
If the shape of the given data is (n,) we have to expand_dims along the first axis so that the shape is (1, n).
Some functions are nice enough to take special care of the (n,) situation. But sometimes we have to do the conversion, (n,) → (1, n), ourselves.
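For example, here is a small sketch with a hypothetical helper row_means that only makes sense for a 2-d (m, n) array; to feed it a single (n,) sample, we expand to (1, n) first:
import numpy as np

def row_means(batch):
    # Hypothetical helper expecting a 2-d array of shape (m, n).
    return batch.mean(axis=1)           # one mean per row -> shape (m,)

a = np.array([1, 2, 3, 4, 5, 6])        # shape (6,)
# row_means(a) would raise an error: a 1-d array has no axis=1.
a1 = np.expand_dims(a, axis=0)          # shape (1, 6)
print(row_means(a1))                    # [3.5]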
I'm currently learning about broadcasting in NumPy, and in the book I'm reading (Python for Data Analysis by Wes McKinney) the author gives the following example to "demean" a two-dimensional array:
import numpy as np
arr = np.random.randn(4, 3)
print(arr.mean(0))
demeaned = arr - arr.mean(0)
print(demeaned)
print(demeaned.mean(0))
Which effectively causes the array demeaned to have a mean of 0.
I had the idea to apply this to an image-like, three-dimensional array:
import numpy as np
arr = np.random.randint(0, 256, (400,400,3))
demeaned = arr - arr.mean(2)
Which of course failed, because according to the broadcasting rule, the trailing dimensions have to match, and that's not the case here:
print(arr.shape) # (400, 400, 3)
print(arr.mean(2).shape) # (400, 400)
Now, I have mostly gotten it to work by subtracting the mean from every single index in the third dimension of the array:
means = arr.mean(2)          # per-pixel mean over the 3 channels, shape (400, 400)
demeaned = np.ones(arr.shape)
for i in range(3):
    demeaned[..., i] = arr[..., i] - means
print(demeaned.mean(0))
At this point, the returned values are very close to zero, and I think that's a precision error. Am I actually right about this, or is there another caveat I missed?
Also, this doesn't seem to be the cleanest, most 'numpy' way to achieve what I wanted. Is there a function or principle I can make use of to improve the code?
As of numpy version 1.7.0, np.mean, and several other functions, accept a tuple in their axis parameter. This means that you can perform the operation on the planes of the image all at once:
m = arr.mean(axis=(0, 1))
This mean will have shape (3,), with one element for each plane of the image.
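Since broadcasting aligns trailing dimensions, this per-plane mean can be subtracted directly (a small illustration, assuming arr from above):
m = arr.mean(axis=(0, 1))                  # shape (3,): one mean per plane
demeaned_planes = arr - m                  # (400, 400, 3) - (3,) broadcasts on the trailing axis
print(demeaned_planes.mean(axis=(0, 1)))   # essentially zero for each plane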
If you want to subtract the means of each pixel individually, you have to remember that broadcasting aligns shape tuples on the right edge. That means that you need to insert an extra dimension:
n = arr.mean(axis=2)
n = n.reshape(*n.shape, 1)
Or
n = arr.mean(axis=2)[..., None]
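Equivalently, keepdims=True keeps the reduced axis around as a length-1 dimension, which is arguably the cleanest way to write the per-pixel demeaning in one line:
# arr.mean(axis=2, keepdims=True) has shape (400, 400, 1),
# which broadcasts against (400, 400, 3) without any manual reshaping.
demeaned = arr - arr.mean(axis=2, keepdims=True)
print(demeaned.mean(axis=2))   # essentially zero, up to floating-point rounding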
Try np.apply_along_axis().
np.apply_along_axis(lambda x: x - np.mean(x), 2, arr)
Output: you get an array of the same shape where each cell is demeaned along the dimension you want (the second parameter; here it is 2).
I have a numpy array 'test' of dimension (100, 100, 16, 16), which gives me a different 16x16 array for each point on a 100x100 grid.
I also have some eigenvalues and vectors where vals has the dimension (100, 100, 16) and vecs (100, 100, 16, 16) where vecs[x, y, :, i] would be the ith eigenvector of the matrix at the point (x, y) corresponding to the ith eigenvalue vals[x, y, i].
Now I want to take the first eigenvector of the array at ALL points on the grid, do a matrix product with the test matrix and then do a scalar product of the resulting vector with all the other eigenvectors of the array at all points on the grid and sum them.
The resulting array should have dimension (100, 100). After this I would like to take the 2nd eigenvector of the array, matrix-multiply it with test, and then take the scalar product of the result with all the eigenvectors other than the 2nd, and so on, so that in the end I have 16 (100, 100) arrays, or rather a (100, 100, 16) array. So far I have only succeeded with a lot of for loops, which I would like to avoid; using tensordot gives me the wrong dimensions, and I don't see how to pick the axis that is vectorized along for the np.dot function.
I heard that einsum might be suited to this task, but anything that doesn't rely on Python loops is fine by me.
import numpy as np
from numpy import linalg as la
test = np.arange(16*16*100*100).reshape((100, 100, 16, 16))
vals, vecs = la.eig(test + 1)
np.tensordot(vecs, test, axes=[2, 3]).shape
>>> (100, 100, 16, 100, 100, 16)
EDIT: Ok, so I used np.einsum to get a correct intermediate result.
np.einsum('ijkl, ijkm -> ijlm', vecs, test)
But in the next step I want to take the scalar product only with all the other entries of vecs. Can I implement some sort of inverse Kronecker delta in this einsum formalism, or should I switch back to regular numpy here?
Ok, I played around and with np.einsum I found a way to do what is described above. A nice feature of einsum is that if you repeat doubly occurring indices in the 'output' (to the right of the '->'), you get element-wise multiplication along some axes and contraction along others (something you don't have in handwritten tensor algebra notation).
result = np.einsum('ijkl, ijlm -> ijkm', np.einsum('ijkl, ijkm -> ijlm', vecs, test), vecs)
This nearly does the trick. Now only the diagonal terms have to be taken out. We can do this by subtracting the diagonal terms like this:
result = result - result * np.eye(np.shape(test)[-1])[None, None, ...]
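If I am reading the original question right, the scalar products for each eigenvector k are then summed over all the other eigenvectors, so one extra reduction over the last axis turns this (100, 100, 16, 16) array into the desired (100, 100, 16) result:
# result[i, j, k, m] is the scalar product of (v_k^T @ test) with v_m at grid point (i, j),
# with the k == m terms zeroed out above. Summing over m (all "other" eigenvectors)
# leaves one value per grid point and per eigenvector k.
final = result.sum(axis=-1)
print(final.shape)   # (100, 100, 16)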
I'm reading an implementation of Multinomial Naive Bayes, and I do not understand how the following dot product of these matrices works.
self.feature_count_ += safe_sparse_dot(Y.T, X)
Code can be found here
Here Y.T.shape = (3, 7000) and X.shape = (7000, 27860). How can this work when the number of rows in Y.T is not equal to the number of columns in X? And the resulting matrix has size (3, 27860)?? How does it work? What am I missing?
Check out the "Multiplying a Matrix by Another Matrix" section here: https://www.mathsisfun.com/algebra/matrix-multiplying.html
If you go through the multiplication, you'll see that only the "inner" dimensions have to match (the 7000 in your case).
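A quick sanity check with much smaller arrays of the same pattern (shapes picked arbitrarily here; only the inner dimension is shared):
import numpy as np

A = np.random.rand(3, 4)    # plays the role of Y.T, shape (3, 7000)
B = np.random.rand(4, 5)    # plays the role of X,   shape (7000, 27860)
print(A.dot(B).shape)       # (3, 5) -- analogous to (3, 27860)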
Assume we want to compute the dot product of a matrix a and a column vector b. In NumPy/Python:
a=numpy.asarray([[1,2,3], [4,5,6], [7,8,9]])
b=numpy.asarray([[2],[1],[3]])
a.dot(b)
Results in:
array([[13],
[31],
[49]])
So far, so good, however why is this also working?
b=numpy.asarray([2,1,3])
a.dot(b)
Results in:
array([13, 31, 49])
I would expect [2,1,3] to be a row vector (which would require a transpose before applying the dot product), but NumPy seems to treat arrays as column vectors by default (in the case of matrix multiplication)?
How does this work?
EDIT:
And why is:
b=numpy.asarray([2,1,3])
b.transpose()==b
So matrix dot 1-d array does work (NumPy then treats the array as a column vector), yet other operations (transpose) do not behave that way. That's not really a consistent design, is it?
Let's first understand how the dot operation is defined in numpy.
(Leaving broadcasting rules out of the discussion, for simplicity) you can perform dot(A,B) if the last dimension of A (i.e. A.shape[-1]) is the same as the next-to-last dimension of B (i.e. B.shape[-2]) if B.ndim>=2, and simply the dimension of B if B.ndim==1.
In other words, if A.shape = (N1, ..., Nk, X) and B.shape = (M1, ..., M(j-1), X, Mj) (note the common X), the resulting array has shape (N1, ..., Nk, M1, ..., M(j-1), Mj) (note that X is dropped).
Or, if A.shape = (N1, ..., Nk, X) and B.shape = (X,), the resulting array has shape (N1, ..., Nk) (again, X is dropped).
Your examples work because they satisfy the rules (the first example satisfies the first, the second satisfies the second):
a=numpy.asarray([[1,2,3], [4,5,6], [7,8,9]])
b=numpy.asarray([[2],[1],[3]])
a.shape, b.shape, '->', a.dot(b).shape # X=3
=> ((3, 3), (3, 1), '->', (3, 1))
b=numpy.asarray([2,1,3])
a.shape, b.shape, '->', a.dot(b).shape # X=3
=> ((3, 3), (3,), '->', (3,))
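The same rule with higher-dimensional arrays (shapes chosen arbitrarily, just to show where X disappears):
import numpy as np

A = np.zeros((2, 3, 4))      # A.shape = (N1, N2, X) with X = 4
B = np.zeros((5, 4, 6))      # B.shape = (M1, X, M2) with X = 4
print(np.dot(A, B).shape)    # (2, 3, 5, 6): X is contracted away

b = np.zeros(4)              # b.shape = (X,)
print(np.dot(A, b).shape)    # (2, 3): again X is dropped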
My recommendation is that, when using numpy, don't think in terms of "row/column vectors", and if possible don't think in terms of "vectors" at all, but in terms of "an array with shape S". This means that both row vectors and column vectors are simply "1dim arrays". As far as numpy is concerned, they are one and the same.
This should also make it clear why, in your case, b.transpose() is the same as b: b being a 1-dim array, it remains a 1-dim array when transposed. Transposing doesn't affect 1-dim arrays.
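If you really do want an explicit column vector, you have to add the second dimension yourself, for example with b[:, None] or b.reshape(-1, 1):
import numpy as np

b = np.asarray([2, 1, 3])
print(b.T.shape)          # (3,)  -- transposing a 1-d array is a no-op
col = b[:, None]          # equivalent to b.reshape(-1, 1)
print(col.shape)          # (3, 1) -- a genuine column vector
print(col.T.shape)        # (1, 3) -- now transpose actually does something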