I recently ran into an accuracy problem with matrix multiplication in NumPy. See my example below, also available here: https://trinket.io/python3/6a4c22e450
import numpy as np
para = np.array([[ 3.28522453e+08, -1.36339334e+08, 1.36339334e+08],
[-1.36339334e+08, 5.65818682e+07, -5.65818682e+07],
[ 1.36339334e+08, -5.65818672e+07, 5.65818682e+07]])
in1 = np.array([[ 285.91695469],
[ 262.3 ],
[-426.64380594]])
in2 = np.array([[ 285.91695537],
[ 262.3 ],
[-426.64380443]])
(in1 - in2)/in1
>>> array([[-2.37831286e-09],
[ 0.00000000e+00],
[ 3.53925214e-09]])
The relative difference between in1 and in2 is very small, about 10^-9.
res1 = para @ in1
>>> array([[-356.2361908 ],
[ 443.16068268],
[-180.86068344]])
res2 = para @ in2
>>> array([[ 73.03147125],
[265.01131439],
[ -2.71131516]])
but after the matrix multiplication, why is the relative difference between the outputs res1 and res2 so large?
(res1 - res2)/res1
>>> array([[1.20500857],
[0.40199723],
[0.98500882]])
This is not a bug; it is to be expected with a matrix such as yours.
Your matrix (which is symmetric) has one large and two small eigenvalues:
In [34]: evals, evecs = np.linalg.eigh(para)
In [35]: evals
Out[35]: array([-1.06130078e-01, 1.00000000e+00, 4.41686189e+08])
Because the matrix is symmetric, it can be diagonalized with an orthonormal basis. That just means that we can define a new coordinate system in which the matrix is diagonal, and the diagonal values are those eigenvalues. The effect of multiplying the matrix by a vector in these coordinates is to simply multiply each coordinate by the corresponding eigenvalue, i.e. the first coordinate is multiplied by -0.106, the second coordinate doesn't change, and the third coordinate is multiplied by the large factor 4.4e8.
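To make that concrete, here is a minimal sketch (not part of the original answer) that checks the diagonalization numerically, reusing para, evals and evecs from above:
# Sketch: evecs should diagonalize the symmetric matrix para.
# The entries of para are ~1e8, so allow an absolute tolerance on that scale.
D = evecs.T @ para @ evecs
print(np.allclose(D, np.diag(evals), atol=1e-5))             # off-diagonal terms are round-off
print(np.allclose(evecs @ np.diag(evals) @ evecs.T, para))   # reconstruction of para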
The reason you get such a drastic change when multiplying the original matrix para by in1 and in2 is that, in the new coordinates, the third component of the transformed in1 is positive, and the third component of the transformed in2 is negative. (That is, the points are on opposite sides of the 2-d eigenspace associated with the two smaller eigenvalues.) There are several ways to find these transformed coordinates; one is to compute inv(V) @ x, where V is the matrix of eigenvectors:
In [36]: np.linalg.solve(evecs, in1)
Out[36]:
array([[ 5.64863071e+02],
[-1.16208620e+02],
[ 8.55527517e-07]])
In [37]: np.linalg.solve(evecs, in2)
Out[37]:
array([[ 5.64863070e+02],
[-1.16208619e+02],
[-2.71381169e-07]])
Note the different signs of the third components. The values are small, but when you multiply by the diagonal matrix, they are multiplied by 4.4e8, giving 377.87 and -119.86, respectively. That large change shows up as the results that you observed in the original coordinates.
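As a sketch (again not from the original output), you can reproduce those numbers by applying the eigenvalues in the transformed coordinates and mapping back with evecs:
# Sketch: apply para in the eigenbasis, then map back to the original coordinates.
c1 = np.linalg.solve(evecs, in1)     # coordinates of in1 in the eigenbasis
c2 = np.linalg.solve(evecs, in2)
print(evals * c1.ravel())            # third entry is roughly +378
print(evals * c2.ravel())            # third entry is roughly -120
print(np.allclose(evecs @ (evals[:, None] * c1), para @ in1))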
For a rougher calculation: note that the elements of para are ~10^8, so multiplication on that order of magnitude occurs when you compute para @ x. It is not surprising, then, that given the relative differences between in1 and in2 are ~10^-9, the relative differences between res1 and res2 will be ~10^-9 * ~10^8, i.e. ~0.1. (Your calculated relative errors were [1.2, 0.4, 0.99], so the rough estimate is in the right ballpark.)
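That estimate can also be checked directly, since the difference of the outputs is exactly para applied to the difference of the inputs. A minimal sketch:
# Sketch: res1 - res2 == para @ (in1 - in2); the ~1e-6 absolute input difference
# is amplified by entries of order 1e8, giving output differences of order 1e2.
diff_in = in1 - in2
diff_out = para @ diff_in
print(np.allclose(diff_out, (para @ in1) - (para @ in2)))
print(diff_in.ravel(), diff_out.ravel())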
This looks like a bug ... NumPy is written in C, so this could be an issue of a number being cast to a smaller float type, which would cause a large floating-point error in this case.
I have an array with shape (128,116,116,1), where the 1st dimension is the number of subjects and the 2nd and 3rd dimensions are the data.
I was trying to calculate the variance (squared deviation from the mean) at each position (i.e. at (0,0), (0,1), (1,0), etc., up to (115,115)) across all 128 subjects, resulting in an array with shape (116,116).
Can anyone tell me how to accomplish this?
Thank you!
Let's say we have a multidimensional list a of shape (3,2,2)
import numpy as np
a = [
    [
        [1, 1],
        [1, 1]
    ],
    [
        [2, 2],
        [2, 2]
    ],
    [
        [3, 3],
        [3, 3]
    ],
]
np.var(a, axis = 0) # results in:
> array([[0.66666667, 0.66666667],
> [0.66666667, 0.66666667]])
If you want to efficiently compute the variance across all 128 subjects (which would be axis 0), I don't see a way to do it using the statistics package, since it doesn't take multidimensional lists as input. You would have to write your own logic and loop over the positions yourself.
But using the numpy.var function, we can easily calculate the variance at each 'datapoint' (tuple of indices) across all 128 subjects.
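Applied to your data, a minimal sketch (assuming your array is called data and has shape (128, 116, 116, 1)) would be:
import numpy as np

# Hypothetical stand-in for your (128, 116, 116, 1) array.
data = np.random.rand(128, 116, 116, 1)

# Variance across the subject axis (axis 0), then drop the trailing singleton axis.
var_map = np.var(data, axis=0)[:, :, 0]   # equivalently: np.var(data, axis=0).squeeze(-1)
print(var_map.shape)                       # (116, 116)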
Side note: You mentioned statistics.variance. However, that is only to be used when you are taking a sample from a population as is mentioned in the documentation you linked. If you were to go the manual route, you would use statistics.pvariance instead, since we are calculating it on the whole dataset.
The difference can be seen here:
statistics.pvariance([1,2,3])
> 0.6666666666666666 # (correct)
statistics.variance([1,2,3])
> 1 # (incorrect)
np.var([1,2,3])
> 0.6666666666666666 # (np.var also gives the correct output)
I am learning SVD by following this MIT course.
The matrix is constructed as
C = np.matrix([[5,5],[-1,7]])
C
matrix([[ 5, 5],
[-1, 7]])
The lecturer gives V in the lecture, and it is close to
w, v = np.linalg.eig(C.T*C)
v
matrix([[-0.9486833 , -0.31622777],
[ 0.31622777, -0.9486833 ]])
but np.linalg.svd(C) gives a different output
u, s, vh = np.linalg.svd(C)
vh
matrix([[ 0.31622777, 0.9486833 ],
[ 0.9486833 , -0.31622777]])
It seems that vh has the elements of V exchanged; is that acceptable?
Did I do and understand this correctly?
For linalg.eig, your eigenvalues are stored in w. These are:
>>> w
array([20., 80.])
For your singular value decomposition, you can get the eigenvalues by squaring the singular values (C is invertible, so everything is easy here):
>>> s**2
array([80., 20.])
As you can see their order is flipped.
From the linalg.eig documentation:
The eigenvalues are not necessarily ordered
From the linalg.svd documentation:
Vector(s) with the singular values, within each vector sorted in descending order. ...
In general, routines that give you eigenvalues and eigenvectors do not necessarily "sort" them the way you might want. So it is always important to make sure you have the eigenvector for the eigenvalue you want. If you need them sorted (e.g. by eigenvalue magnitude), you can always do this yourself, as sketched below (see also: sort eigenvalues and associated eigenvectors after using numpy.linalg.eig in python).
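For example, a minimal sketch (using C from the question) that sorts the eigenpairs in descending order, matching the order svd uses, could look like:
# Sketch: sort the eigenpairs of C.T*C by descending eigenvalue.
w, v = np.linalg.eig(C.T * C)
order = np.argsort(w)[::-1]    # indices for descending eigenvalues
w_sorted = w[order]            # array([80., 20.])
v_sorted = v[:, order]         # columns reordered to match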
Finally, note that the rows of vh contain the eigenvectors, whereas in v it's the columns.
So that means that e.g.:
>>> v[:,0].flatten()
matrix([[-0.9486833 , 0.31622777]])
>>> vh[1].flatten()
matrix([[ 0.9486833 , -0.31622777]])
both give you the eigenvector for the eigenvalue 20 (up to sign).
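A quick way to convince yourself of that (a sketch, reusing v and vh from above) is to compare the two vectors up to sign:
# Sketch: the two vectors are the same eigenvector, possibly with the sign flipped.
e_eig = np.asarray(v[:, 0]).ravel()   # eigenvector for eigenvalue 20 from eig
e_svd = np.asarray(vh[1]).ravel()     # right-singular vector for singular value sqrt(20)
print(np.allclose(e_eig, e_svd) or np.allclose(e_eig, -e_svd))   # True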
I want to multiply a stack of n m×m matrices by a stack of n vectors of length m, so that the nth column of the resulting m×n array contains the dot product of the nth matrix with the nth vector:
vec1=np.array([0,0.5,1,0.5]); vec2=np.array([2,0.5,1,0.5])
vec=np.transpose(n.stack((vec1,vec2)))
mat = np.moveaxis(n.array([[[0,1,2,3],[0,1,2,3],[0,1,2,3],[0,1,2,3]],[[-1,2.,0,1.],[0,0,-1,2.],[0,1,-1,2.],[1,0.1,1,1]]]),0,2)
outvec=np.zeros((4,2))
for i in range(2):
    outvec[:,i]=np.dot(mat[:,:,i],vec[:,i])
Inspired by this post Element wise dot product of matrices and vectors, I have tried various permutations of index combinations in einsum, and have found that
np.einsum('ijk,jk->ik',mat,vec)
gives the correct result.
Unfortunately, I really do not understand this - I assumed that repeating the index k in the 'ijk,jk' part means that I multiply AND sum over k. I've tried to read the documentation (https://docs.scipy.org/doc/numpy-1.15.1/reference/generated/numpy.einsum.html), but I still don't understand.
(My previous attempts included
np.einsum('ijk,il->ik', mat, vec)
but I'm not even sure what that means. What happens to the index l when I drop it?)
Thanks in advance!
Read up on Einstein summation notation.
Basically, the rules are:
Without a ->
Any letter repeated in the inputs represents an axis to be multiplied and summed over
Any letter not repeated in the inputs is included in the output
With a ->
Any letter repeated in the inputs represents an axis to be multiplied over
Any letter not in the output represents an axis to be summed over
So, for example, with matrices A and B with the same shape:
np.einsum('ij, ij', A, B) # is A ddot B (i.e. sum(A * B)), returns 0d scalar
np.einsum('ij, jk', A, B) # is A dot B, returns 2d tensor
np.einsum('ij, kl', A, B) # is outer(A, B), returns 4d tensor
np.einsum('ji, jk, kl', A, B, A) # is A.T @ B @ A, returns 2d tensor
np.einsum('ij, ij -> ij', A, B) # is A * B, returns 2d tensor
np.einsum('ij, ij -> i' , A, A) # is norm(A, axis = 1)**2 (squared row norms), returns 1d tensor
np.einsum('ii' , A) # is tr(A), returns 0d scalar
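If it helps, here is a minimal sketch (with small made-up matrices, not part of the rules above) verifying two of those identities numerically:
import numpy as np

A = np.arange(6.).reshape(2, 3)
B = np.arange(6., 12.).reshape(3, 2)

# 'ij,jk' with no '->': j is repeated, so it is multiplied and summed over -> A @ B
print(np.allclose(np.einsum('ij,jk', A, B), A @ B))

# 'ij,ij->ij': every index is kept in the output, so nothing is summed -> A * A
print(np.allclose(np.einsum('ij,ij->ij', A, A), A * A))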
In [321]: vec1=np.array([0,0.5,1,0.5]); vec2=np.array([2,0.5,1,0.5])
...: vec=np.transpose(np.stack((vec1,vec2)))
In [322]: vec1.shape
Out[322]: (4,)
In [323]: vec.shape
Out[323]: (4, 2)
A nice thing about the stack function is we can specify an axis, skipping the transpose:
In [324]: np.stack((vec1,vec2), axis=1).shape
Out[324]: (4, 2)
Why the mix of np. and n.? NameError: name 'n' is not defined. That kind of thing almost sends me away.
In [326]: mat = np.moveaxis(np.array([[[0,1,2,3],[0,1,2,3],[0,1,2,3],[0,1,2,3]],[[-1,2.,0
...: ,1.],[0,0,-1,2.],[0,1,-1,2.],[1,0.1,1,1]]]),0,2)
In [327]: mat.shape
Out[327]: (4, 4, 2)
In [328]: outvec=np.zeros((4,2))
...: for i in range(2):
...: outvec[:,i]=np.dot(mat[:,:,i],vec[:,i])
...:
In [329]: outvec
Out[329]:
array([[ 4. , -0.5 ],
[ 4. , 0. ],
[ 4. , 0.5 ],
[ 4. , 3.55]])
In [330]: # (4,4,2) (4,2) 'kji,ji->ki'
From your loop, the location of the i axis (size 2) is clear - it is last in all 3 arrays. That leaves one axis for vec; let's call that j. It pairs with the second-to-last axis of mat (the one next to i). k carries over from mat to outvec.
In [331]: np.einsum('kji,ji->ki', mat, vec)
Out[331]:
array([[ 4. , -0.5 ],
[ 4. , 0. ],
[ 4. , 0.5 ],
[ 4. , 3.55]])
Often the einsum string writes itself. For example, if mat were described as (m,n,k) and vec as (n,k), with the result being (m,k), the string would simply be 'mnk,nk->mk'.
In this case only the j dimension is summed - it appears on the left, but not on the right. The last dimension, i in my notation, is not summed because it appears on both sides, just as it does in your iteration. I think of it as 'going along for the ride'. It isn't actively part of the dot product.
You are, in effect, stacking on the last dimension, the size-2 one. Usually we stack on the first, but you transposed both arrays to put that axis last.
Your 'failed' attempt runs, and can be reproduced as:
In [332]: np.einsum('ijk,il->ik', mat, vec)
Out[332]:
array([[12. , 4. ],
[ 6. , 1. ],
[12. , 4. ],
[ 6. , 3.1]])
In [333]: mat.sum(axis=1)*vec.sum(axis=1)[:,None]
Out[333]:
array([[12. , 4. ],
[ 6. , 1. ],
[12. , 4. ],
[ 6. , 3.1]])
The j and l dimensions don't appear on the right, so they are summed. They can be summed before multiplying because they appear in only one term each. I added the None to enable broadcasting (multiplying an (i,k) array by an (i,) array).
np.einsum('ik,i->ik', mat.sum(axis=1), vec.sum(axis=1))
If you'd stacked on the first axis instead, and added a trailing dimension to the vectors (giving shape (2,4,1)), it would matmul with the (2,4,4) stack of matrices: m1 @ v1[...,None], with m1 = mat.transpose(2,0,1) and v1 = vec.T.
In [337]: m1 = mat.transpose(2,0,1)
In [338]: m1 @ v1[...,None]
Out[338]:
array([[[ 4. ],
[ 4. ],
[ 4. ],
[ 4. ]],
[[-0.5 ],
[ 0. ],
[ 0.5 ],
[ 3.55]]])
In [339]: _.shape
Out[339]: (2, 4, 1)
einsum is easy (once you have played with permutations of indices for a while, that is...).
Let's work with something simple: a stack of three 2×2 matrices and a stack of three vectors of length 2.
import numpy as np
a = np.arange(3*2*2).reshape((3,2,2))
b = np.arange(3*2).reshape((3,2))
First, we need to know what we want to compute using einsum:
In [101]: for i in range(3):
     ...:     print(a[i] @ b[i])
[1 3]
[23 33]
[77 95]
What have we done? We have an index i that is fixed while we perform a dot product between one of the stacked matrices and one of the stacked vectors (both indexed by i), and each individual output line implies a summation over the last index of the stacked matrix and the lone index of the stacked vector.
This is easily encoded in an einsum directive
we want the same i index to specify the matrix, the vector and also the output,
we want to reduce along the last matrix index and the remaining vector index, say k
we want to have as many columns in the output as the rows in each stacked matrix, say j
Hence
In [102]: np.einsum('ijk,ik->ij', a, b)
Out[102]:
array([[ 1, 3],
[23, 33],
[77, 95]])
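As a quick sanity check (a sketch, reusing a and b from above), the directive agrees with the explicit loop and with the batched matmul operator:
# Sketch: einsum result vs. the loop and vs. a @ b[..., None].
loop = np.stack([a[i] @ b[i] for i in range(3)])
print(np.allclose(np.einsum('ijk,ik->ij', a, b), loop))                         # True
print(np.allclose(np.einsum('ijk,ik->ij', a, b), (a @ b[..., None])[..., 0]))   # True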
I hope that my discussion of how I got the directive right is clear, correct and useful.
I have an array of shape (3,2):
import numpy as np
arr = np.array([[0.,0.],[0.25,-0.125],[0.5,-0.125]])
I am trying to build a matrix (matrix) of dimensions (6,2) that stacks, vertically, the outer product of each row of arr with itself (i.e. np.outer(arr[i], arr[i].T)). At the moment I am using a for loop such as:
size = np.shape(arr)
matrix = np.zeros((size[0]*size[1],size[1]))
for i in range(np.shape(arr)[0]):
    prod = np.outer(arr[i],arr[i].T)
    matrix[size[1]*i:size[1]+size[1]*i,:] = prod
Resulting in:
matrix = array([[ 0. , 0. ],
[ 0. , 0. ],
[ 0.0625 , -0.03125 ],
[-0.03125 , 0.015625],
[ 0.25 , -0.0625 ],
[-0.0625 , 0.015625]])
Is there any way to build this matrix without using a for loop (e.g. broadcasting)?
Extend the array to 3D with None/np.newaxis, keeping the first axis aligned while letting the second axis get pair-wise multiplied, perform the multiplication leveraging broadcasting, and reshape back to 2D -
matrix = (arr[:,None,:]*arr[:,:,None]).reshape(-1,arr.shape[1])
We can also use np.einsum -
matrix = np.einsum('ij,ik->ijk',arr,arr).reshape(-1,arr.shape[1])
The einsum string representation might be more intuitive, as it lets us visualize three things:
Axes that are aligned (axis=0 here).
Axes that are getting summed up (none here).
Axes that are kept i.e. element-wise multiplied (axis=1 here).
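As a quick check (a sketch, not part of the original answer), both one-liners reproduce the loop-based result:
# Sketch: compare the broadcasting and einsum versions against the explicit loop.
loop_matrix = np.vstack([np.outer(row, row) for row in arr])
bcast_matrix = (arr[:, None, :] * arr[:, :, None]).reshape(-1, arr.shape[1])
einsum_matrix = np.einsum('ij,ik->ijk', arr, arr).reshape(-1, arr.shape[1])
print(np.allclose(loop_matrix, bcast_matrix))   # True
print(np.allclose(loop_matrix, einsum_matrix))  # True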