Numpy calculation of eigenvectors is incorrect - python

I ran the following in Python and expected the columns of E[1] to be the eigenvectors of A, but they are not. Only Sympy.Matrix.eigenvects() seems to get it right. Why this error?
A
Out[194]:
matrix([[-3,  3,  2],
        [ 1, -1, -2],
        [-1, -3,  0]])
E = np.linalg.eig(A)
E
Out[196]:
(array([ 2., -4., -2.]),
 matrix([[ -2.01889132e-16,   9.48683298e-01,   8.94427191e-01],
         [  5.54700196e-01,  -3.16227766e-01,  -3.71551690e-16],
         [ -8.32050294e-01,   2.73252305e-17,   4.47213595e-01]]))
A*E[1] / E[1]
Out[205]:
matrix([[ 6.59900617, -4.        , -2.        ],
        [ 2.        , -4.        , -3.88449298],
        [ 2.        ,  8.125992  , -2.        ]])

The eigenvectors are correct, within an expected margin of error.
What you discovered is that testing eigenvectors with element-wise division is a bad idea.
A better way is to compute the norm of the difference between matrix*vector and eigenvalue*vector.
NumPy does its computations in floating-point arithmetic (IEEE double precision, with a 53-bit significand). This means any of its answers may contain numerical errors of relative size around 2**(-52), which is about 2.2e-16. So when you see a number like 2e-16 coming out of a calculation whose inputs are of size 1-3, the conclusion is: "that number should probably be zero, and the value we have for it is just noise". And if you divide by that number, noise is all you get.
SymPy, on the other hand, performs symbolic manipulations, so its answer (when it can get one) is exactly what the theory predicts.
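For example, here is a minimal sketch of that residual check, using the A above (np.linalg.eig returns the eigenvalues w and the eigenvectors v as columns):

import numpy as np

A = np.array([[-3,  3,  2],
              [ 1, -1, -2],
              [-1, -3,  0]])
w, v = np.linalg.eig(A)

# For each eigenpair, ||A v_i - w_i v_i|| should be near machine epsilon
for i in range(len(w)):
    residual = np.linalg.norm(np.dot(A, v[:, i]) - w[i] * v[:, i])
    print(w[i], residual)    # residuals on the order of 1e-15, i.e. noise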

From its docs:
The number w is an eigenvalue of a if there exists a vector v such that dot(a,v) = w * v. Thus, the arrays a, w, and v satisfy the equations dot(a[:,:], v[:,i]) = w[i] * v[:,i] for i in {0, ..., M-1}.
With your matrix:
In [1]: A = np.array([[-3,  3,  2],
   ...:               [ 1, -1, -2],
   ...:               [-1, -3,  0]])
   ...:
In [2]: w,v=np.linalg.eig(A)
In [3]: w
Out[3]: array([ 2., -4., -2.])
In [4]: v
Out[4]:
array([[ -9.39932874e-17,   9.48683298e-01,   8.94427191e-01],
       [  5.54700196e-01,  -3.16227766e-01,   1.93473310e-16],
       [ -8.32050294e-01,  -4.08811066e-17,   4.47213595e-01]])
In [5]: np.dot(A,v)
Out[5]:
array([[ -2.22044605e-16,  -3.79473319e+00,  -1.78885438e+00],
       [  1.10940039e+00,   1.26491106e+00,  -7.77156117e-16],
       [ -1.66410059e+00,   4.44089210e-16,  -8.94427191e-01]])
In [6]: w*v
Out[6]:
array([[ -1.87986575e-16,  -3.79473319e+00,  -1.78885438e+00],
       [  1.10940039e+00,   1.26491106e+00,  -3.86946619e-16],
       [ -1.66410059e+00,   1.63524427e-16,  -8.94427191e-01]])
In [7]: np.dot(A,v)-w*v
Out[7]:
array([[ -3.40580301e-17,   8.88178420e-16,   2.22044605e-16],
       [  8.88178420e-16,  -6.66133815e-16,  -3.90209498e-16],
       [ -2.22044605e-16,   2.80564783e-16,  -3.33066907e-16]])
In [8]: np.allclose(np.dot(A,v), w*v)
Out[8]: True
So, yes, the documented test is satisfied, within floating point limits.
einsum can be used to highlight the i axis in the dot calculation.
In [10]: np.einsum('...k,ki->...i',A,v)
Out[10]:
array([[ -2.22044605e-16,  -3.79473319e+00,  -1.78885438e+00],
       [  1.10940039e+00,   1.26491106e+00,  -7.77156117e-16],
       [ -1.66410059e+00,   3.88578059e-16,  -8.94427191e-01]])
When I divide element-wise by v, the result matches the eigenvalues, 2, -4, -2, except where v and the dot product are virtually 0 (1e-16 or smaller).
In [11]: np.einsum('...k,ki->...i',A,v)/v
Out[11]:
array([[ 2.36234534, -4.        , -2.        ],
       [ 2.        , -4.        , -4.01686475],
       [ 2.        , -9.50507681, -2.        ]])

Related

Scipy: Calculation of standardized euclidean via cdist

The formula is available in the docs and pointed to in this answer. However, when I try to apply it I'm not getting a matching answer. I'm sure there's some silly mistake I'm making somewhere, so thanks for bearing with me:
Setup
Say I have 2 matrices:
X: array([[0, 1, 0],
          [1, 1, 1]])
X2: array([[1, 1, 0],
           [1, 1, 1],
           [1, 2, 0]])
Now applying Xans = scipy.spatial.distance.cdist(X, X2, 'seuclidean') gives:
Xans: array([[2.23606798, 2.88675135, 3.16227766],
             [1.82574186, 0.        , 2.88675135]])
Let's just focus on Xans[0][0] = 2.23606798, which should have been obtained by applying seuclidean(X[0], X2[0]).
Method 1: Using pdist
I tried doing this via pdist but get a NaN:
In [104]: scipy.spatial.distance.pdist([X[0], X2[0]], metric='seuclidean')
Out[104]: array([nan])
Why is this happening?
Method 2: Direct Formula Application
I tried manually using the formula linked in the answer above as follows:
In [107]: (((X[0] - X2[0])**2).sum()/(np.var([X[0], X2[0]])))**0.5
Out[107]: 2.0
As can be seen, this gives 2.0 instead of the expected 2.23606798.
I'm clearly doing something very wrong. What is it?
The standardized Euclidean distance weights each variable with a separate variance. If you don't provide the variances with the V argument, it computes them from the input array. This is mentioned in the pdist docstring in the "Parameters" section under **kwargs, where it shows:
V : ndarray
    The variance vector for standardized Euclidean.
    Default: var(X, axis=0, ddof=1)
For example:
In [39]: A
Out[39]:
array([[3, 0, 2],
       [2, 1, 2],
       [0, 0, 1],
       [3, 1, 2],
       [1, 0, 0]])
In [40]: from scipy.spatial.distance import pdist
In [41]: pdist(A, metric='seuclidean')
Out[41]:
array([ 1.98029509,  2.55814731,  1.82574186,  2.71163072,  2.63368079,
        0.76696499,  2.9868995 ,  3.14284123,  1.35581536,  3.26898677])
We get the same result if we provide the variances computed as explained in the docstring:
In [42]: pdist(A, metric='seuclidean', V=np.var(A, axis=0, ddof=1))
Out[42]:
array([ 1.98029509,  2.55814731,  1.82574186,  2.71163072,  2.63368079,
        0.76696499,  2.9868995 ,  3.14284123,  1.35581536,  3.26898677])
Of course, if you provide variances that are all 1, you get the regular Euclidean distance:
In [43]: pdist(A, metric='seuclidean', V=np.ones(A.shape[1]))
Out[43]:
array([ 1.41421356,  3.16227766,  1.        ,  2.82842712,  2.44948974,
        1.        ,  2.44948974,  3.31662479,  1.41421356,  3.        ])
In [44]: pdist(A, metric='euclidean')
Out[44]:
array([ 1.41421356,  3.16227766,  1.        ,  2.82842712,  2.44948974,
        1.        ,  2.44948974,  3.31662479,  1.41421356,  3.        ])
The problem with your "Method 1" is that in your input array of just two points (i.e. [X[0], X2[0]]), the second and third components of the points don't change, so the variance associated with those components is 0:
In [45]: p = np.array([X[0], X2[0]])
In [46]: p
Out[46]:
array([[0, 1, 0],
       [1, 1, 0]])
In [47]: np.var(p, axis=0, ddof=1)
Out[47]: array([ 0.5, 0. , 0. ])
When the code for seuclidean divides by these variances, the result is either infinity or NaN; the latter occurs when the numerator is also 0, which is the case in the third component of the input [X[0], X2[0]].
To work around this, you have to decide how you want to handle the case where the variance of a component is 0, and handle it explicitly. For example, if you want it to act like that variance is 1 in that case (just to avoid dividing by 0) you could do something like the following.
Suppose B is our array of points. The third column of B is all 1s.
In [63]: B
Out[63]:
array([[3, 0, 1],
       [2, 1, 1],
       [0, 0, 1],
       [3, 1, 1],
       [1, 0, 1]])
Compute the variances of the columns:
In [64]: V = np.var(B, axis=0, ddof=1)
In [65]: V
Out[65]: array([ 1.7, 0.3, 0. ])
Replace the variances that are 0 with 1:
In [66]: V[V == 0] = 1
In [67]: V
Out[67]: array([ 1.7, 0.3, 1. ])
Use V to compute the standardized Euclidean distances:
In [68]: pdist(B, metric='seuclidean', V=V)
Out[68]:
array([ 1.98029509,  2.30089497,  1.82574186,  1.53392998,  2.38459106,
        0.76696499,  1.98029509,  2.93725228,  0.76696499,  2.38459106])
This has the same effect as simply removing the constant column:
In [69]: pdist(B[:, :2], metric='seuclidean')
Out[69]:
array([ 1.98029509,  2.30089497,  1.82574186,  1.53392998,  2.38459106,
        0.76696499,  1.98029509,  2.93725228,  0.76696499,  2.38459106])
Your "Method 2" is wrong because your formula is wrong. You have to keep the variances for each component. np.var([X[0], X2[0]]) computes the (single) variance of all the values in the input. Instead, you need to use the axis and ddof arguments shown above.

numpy dot product returns NaN, Matlab equivalent does not return NaN

I have a vector of beta=np.array([[1],[4],[0]]) and when I use np.log with this vector, I get this:
>>> np.log(beta)
array([[ 0.        ],
       [ 1.38629436],
       [       -inf]])
but when I use np.dot with this beta and an identity matrix it gives NaN instead of 1.38629436 as element at [1,0].
>>> np.dot(np.eye(3),np.log(beta))
array([[ nan],
       [ nan],
       [-inf]])
I tried also this one:
>>> beta2 = np.log(beta)
>>> beta2
array([[ 0.        ],
       [ 1.38629436],
       [       -inf]])
>>> np.dot(np.eye(3),beta2)
array([[ nan],
       [ nan],
       [-inf]])
Matlab version of same multiplication does not return NaN. I would like to have the same in numpy. Any ideas?
Edit: I know basic linear algebra, people, thanks for that. My actual question was how to get a numpy equivalent of the dot product that behaves like the Matlab one, which doesn't return NaN in the same case.
The 3rd component of the vector, -inf, is involved in all the products with the rows of the matrix. Infinity times zero is indeterminate, and Python, like most languages, declares that to be not a number.
What's the first element of
[[1 0 0]   [[ 0.        ]
 [0 1 0] *  [ 1.38629436]
 [0 0 1]]   [       -inf]]
? Well, it's 1*0 + 0*1.38629436 + 0*-inf. See that last part?
0*-inf
All those nice theorems of linear algebra go straight out the window if you start trying to put infinities in your matrices. Heck, those theorems only approximately hold with finite floating-point numbers, since floating-point numbers and floating-point arithmetic can only approximate real numbers and real arithmetic.
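If you want a numpy expression that mimics the Matlab result anyway, here is a minimal sketch (an illustration, not what Matlab actually does internally): mask out the products whose matrix coefficient is 0, so the 0 * (-inf) terms never reach the sum.

import numpy as np

beta = np.array([[1.0], [4.0], [0.0]])
logb = np.log(beta)                       # contains -inf
M = np.eye(3)

# Select 0 wherever the matrix coefficient is 0; the nan from 0 * (-inf)
# is still computed in the second branch, but np.where discards it
prod = np.where(M[:, :, None] == 0, 0.0, M[:, :, None] * logb[None, :, :])
print(prod.sum(axis=1))    # [[ 0.        ] [ 1.38629436] [       -inf]]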
The dot involves multiplying 'all' values, and summing one axis. The equivalent to this dot is
np.einsum('ij,jk->ik', np.eye(3), np.log(beta))
which can be evaluated with broadcasting as:
In [223]: np.eye(3)[:,:,None]*np.log(beta)[None,:,:]
Out[223]:
array([[[ 0.        ],
        [ 0.        ],
        [        nan]],

       [[ 0.        ],
        [ 1.38629436],
        [        nan]],

       [[ 0.        ],
        [ 0.        ],
        [       -inf]]])
In [224]: (np.eye(3)[:,:,None]*np.log(beta)[None,:,:]).sum(axis=1)
Out[224]:
array([[ nan],
       [ nan],
       [-inf]])
So the first nan comes from summing [0,0,nan].
In [226]: 0*np.log(beta)
Out[226]:
array([[  0.],
       [  0.],
       [ nan]])
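Building on that broadcast form, one hedged workaround is np.nansum, which treats the nan terms as 0 when summing (note it would also silently ignore any genuine nan in your data):

import numpy as np

beta = np.array([[1], [4], [0]])
prod = np.eye(3)[:, :, None] * np.log(beta)[None, :, :]

# nansum skips the 0 * (-inf) = nan terms when summing along axis 1
result = np.nansum(prod, axis=1)
print(result)    # [[ 0.        ]
                 #  [ 1.38629436]
                 #  [       -inf]]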

Weird behavior when squaring elements in numpy array

I have two numpy arrays of shape (1, 250000):
a = [[ 0 254 1 ..., 255 0 1]]
b = [[ 1 0 252 ..., 0 255 255]]
I want to create a new numpy array whose elements are the square root of the sum of squares of elements in the arrays a and b, but I am not getting the correct result:
>>> c = np.sqrt(np.square(a)+np.square(b))
>>> print c
[[ 1. 2. 4.12310553 ..., 1. 1. 1.41421354]]
Am I missing something simple here?
Presumably your arrays a and b are arrays of unsigned 8-bit integers; you can check by inspecting the attribute a.dtype. When you square them, the data type is preserved, and the 8-bit values overflow, which means the values "wrap around" (i.e. the squared values are reduced modulo 256):
In [7]: a = np.array([[0, 254, 1, 255, 0, 1]], dtype=np.uint8)
In [8]: np.square(a)
Out[8]: array([[0, 4, 1, 1, 0, 1]], dtype=uint8)
In [9]: b = np.array([[1, 0, 252, 0, 255, 255]], dtype=np.uint8)
In [10]: np.square(a) + np.square(b)
Out[10]: array([[ 1, 4, 17, 1, 1, 2]], dtype=uint8)
In [11]: np.sqrt(np.square(a) + np.square(b))
Out[11]:
array([[ 1.        ,  2.        ,  4.12310553,  1.        ,  1.        ,
         1.41421354]], dtype=float32)
To avoid the problem, you can tell np.square to use a floating point data type:
In [15]: np.sqrt(np.square(a, dtype=np.float64) + np.square(b, dtype=np.float64))
Out[15]:
array([[   1.        ,  254.        ,  252.00198412,  255.        ,
         255.        ,  255.00196078]])
You could also use the function numpy.hypot, but you might still want to use the dtype argument, otherwise the default data type is np.float16:
In [16]: np.hypot(a, b)
Out[16]: array([[ 1., 254., 252., 255., 255., 255.]], dtype=float16)
In [17]: np.hypot(a, b, dtype=np.float64)
Out[17]:
array([[   1.        ,  254.        ,  252.00198412,  255.        ,
         255.        ,  255.00196078]])
You might wonder why the dtype argument that I used in numpy.square and numpy.hypot is not shown in the functions' docstrings. Both of these functions are numpy "ufuncs", and the authors of numpy decided that it was better to show only the main arguments in the docstring. The optional arguments are documented in the reference manual.
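An equivalent alternative (a small sketch of the same idea) is to cast the inputs up front with astype, so every subsequent operation happens in floating point:

import numpy as np

a = np.array([[0, 254, 1, 255, 0, 1]], dtype=np.uint8)
b = np.array([[1, 0, 252, 0, 255, 255]], dtype=np.uint8)

# Cast once; squaring and summing then happen in float64, with no wrap-around
af = a.astype(np.float64)
bf = b.astype(np.float64)
c = np.sqrt(af**2 + bf**2)
print(c)    # [[   1.  254.  252.00198412  255.  255.  255.00196078]]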
For this simple case, it works perfectly fine (with the default integer dtype there is no overflow for these small values):
In [1]: a = np.array([[ 0, 2, 4, 6, 8]])
In [2]: b = np.array([[ 1, 3, 5, 7, 9]])
In [3]: c = np.sqrt(np.square(a) + np.square(b))
In [4]: print(c)
[[ 1. 3.60555128 6.40312424 9.21954446 12.04159458]]
You must be doing something wrong.

Difference between array and matrix numpy for solving linear equations

There are many questions already asked on the same grounds.
I also read the official documentation (http://www.scipy.org/scipylib/faq.html#what-is-the-difference-between-matrices-and-arrays) regarding the differences, but I am still struggling to understand the philosophical difference between numpy arrays and matrices.
More precisely, I am seeking the reason for the results below.
#using array
>>> A = np.array([[ 1, -1,  2],
...               [ 0,  1, -1],
...               [ 0,  0,  1]])
>>> b = np.array([5,-1,3])
>>> x = np.linalg.solve(A,b)
>>> x
array([ 1., 2., 3.])
#using matrix
>>> A=np.mat(A)
>>> b=np.mat(b)
>>> A
matrix([[ 1, -1,  2],
        [ 0,  1, -1],
        [ 0,  0,  1]])
>>> b
matrix([[ 5, -1, 3]])
>>> x = np.linalg.solve(A,b)
>>> x
matrix([[  5.,  -1.,   3.],
        [ 10.,  -2.,   6.],
        [  5.,  -1.,   3.]])
Why does the system of linear equations represented as arrays yield the correct solution, while the matrix representation yields another matrix as its solution?
Also, honestly, I don't understand the reason for getting a matrix as the solution in the second case.
Sorry if the question is already answered and I failed to notice, and pardon me if my understanding of numpy arrays and matrices is wrong.
You have a transpose issue...when you go to matrix land, column-vectors and row-vectors are no longer interchangeable:
import numpy as np

A = np.array([[ 1, -1,  2],
              [ 0,  1, -1],
              [ 0,  0,  1]])
b = np.array([5, -1, 3])
x = np.linalg.solve(A, b)
print 'arrays:'
print x

A = np.matrix(A)
b = np.matrix(b)
x = np.linalg.solve(A, b)
print 'matrix, wrong set up:'
print x

b = b.T
x = np.linalg.solve(A, b)
print 'matrix, right set up:'
print x
yields:
arrays:
[ 1.  2.  3.]
matrix, wrong set up:
[[  5.  -1.   3.]
 [ 10.  -2.   6.]
 [  5.  -1.   3.]]
matrix, right set up:
[[ 1.]
 [ 2.]
 [ 3.]]
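As a quick sanity check (a sketch using the same data), you can verify a candidate solution by substituting it back into the equations:

import numpy as np

A = np.array([[1, -1,  2],
              [0,  1, -1],
              [0,  0,  1]])
b = np.array([5, -1, 3])

x = np.linalg.solve(A, b)
print(np.allclose(np.dot(A, x), b))    # True: x really solves A x = b

# The same check with the matrix type only passes for the column form
Am, bm = np.matrix(A), np.matrix(b).T  # bm is 3x1
xm = np.linalg.solve(Am, bm)
print(np.allclose(Am * xm, bm))        # True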

Scipy's fftpack dct and idct

Let's say you use the dct function, then do no manipulation of the data and apply the inverse transform; shouldn't the inverted data be the same as the pre-transformed data? Why the floating point issue? Is it a reported issue or is it normal behavior?
In [21]: a = [1.2, 3.4, 5.1, 2.3, 4.5]
In [22]: b = dct(a)
In [23]: b
Out[23]: array([ 33. , -4.98384545, -4.5 , -5.971707 , 4.5 ])
In [24]: c = idct(b)
In [25]: c
Out[25]: array([ 12., 34., 51., 23., 45.])
Does anyone have an explanation why? Of course, a simple c * 10**-1 would do the trick here, but if you repeat the call of the function to use it on several dimensions, the error gets bigger:
In [37]: a = np.random.rand(3,3,3)
In [38]: d = dct(dct(dct(a).transpose(0,2,1)).transpose(2,1,0)).transpose(2,1,0).transpose(0,2,1)
In [39]: e = idct(idct(idct(d).transpose(0,2,1)).transpose(2,1,0)).transpose(2,1,0).transpose(0,2,1)
In [40]: a
Out[40]:
array([[[ 0.48709809,  0.50624831,  0.91190972],
        [ 0.56545798,  0.85695062,  0.62484782],
        [ 0.96092354,  0.17453537,  0.17884233]],

       [[ 0.29433402,  0.08540074,  0.18574437],
        [ 0.09942075,  0.78902363,  0.62663572],
        [ 0.20372951,  0.67039551,  0.52292875]],

       [[ 0.79952289,  0.48221372,  0.43838685],
        [ 0.25559683,  0.39549153,  0.84129493],
        [ 0.69093533,  0.71522961,  0.16522915]]])
In [41]: e
Out[41]:
array([[[ 105.21318703,  109.34963575,  196.97249887],
        [ 122.13892469,  185.10133376,  134.96712825],
        [ 207.55948396,   37.69964085,   38.62994399]],

       [[  63.57614855,   18.44656009,   40.12078466],
        [  21.47488098,  170.42910452,  135.35331646],
        [  44.00557341,  144.80543099,  112.95260949]],

       [[ 172.69694529,  104.15816275,   94.69156014],
        [  55.20891593,   85.42617016,  181.71970442],
        [ 149.2420308 ,  154.48959477,   35.68949734]]])
Here is a link to the doc.
It looks like dct and idct do not normalize by default. Define dct to call fftpack.dct in the following manner, and do the same for idct:
In [13]: dct = lambda x: fftpack.dct(x, norm='ortho')
In [14]: idct = lambda x: fftpack.idct(x, norm='ortho')
Once done, you will get back the original answers after performing the transforms.
In [19]: import numpy
In [20]: a = numpy.random.rand(3,3,3)
In [21]: d = dct(dct(dct(a).transpose(0,2,1)).transpose(2,1,0)).transpose(2,1,0).transpose(0,2,1)
In [22]: e = idct(idct(idct(d).transpose(0,2,1)).transpose(2,1,0)).transpose(2,1,0).transpose(0,2,1)
In [23]: a
Out[23]:
array([[[ 0.51699637,  0.42946223,  0.89843545],
        [ 0.27853391,  0.8931508 ,  0.34319118],
        [ 0.51984431,  0.09217771,  0.78764716]],

       [[ 0.25019845,  0.92622331,  0.06111409],
        [ 0.81363641,  0.06093368,  0.13123373],
        [ 0.47268657,  0.39635091,  0.77978269]],

       [[ 0.86098829,  0.07901332,  0.82169182],
        [ 0.12560088,  0.78210188,  0.69805434],
        [ 0.33544628,  0.81540172,  0.9393219 ]]])
In [24]: e
Out[24]:
array([[[ 0.51699637,  0.42946223,  0.89843545],
        [ 0.27853391,  0.8931508 ,  0.34319118],
        [ 0.51984431,  0.09217771,  0.78764716]],

       [[ 0.25019845,  0.92622331,  0.06111409],
        [ 0.81363641,  0.06093368,  0.13123373],
        [ 0.47268657,  0.39635091,  0.77978269]],

       [[ 0.86098829,  0.07901332,  0.82169182],
        [ 0.12560088,  0.78210188,  0.69805434],
        [ 0.33544628,  0.81540172,  0.9393219 ]]])
I am not sure why no normalization was chosen as the default. But when using ortho, dct and idct each seem to normalize by a factor of 1/sqrt(2*N) or 1/sqrt(4*N). There may be applications where the normalization is needed for dct and not idct, and vice versa.
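For reference, with fftpack's default (unnormalized) type-2 transforms, a dct/idct round trip scales the data by exactly 2*N; that is the factor of 10 seen in the first example (N=5). A short sketch:

import numpy as np
from scipy import fftpack

a = np.array([1.2, 3.4, 5.1, 2.3, 4.5])
N = len(a)

c = fftpack.idct(fftpack.dct(a))       # no normalization on either call
print(np.allclose(c, 2 * N * a))       # True: round trip is scaled by 2*N

# Dividing by 2*N, or passing norm='ortho' to both calls, recovers the input
print(np.allclose(fftpack.idct(fftpack.dct(a, norm='ortho'), norm='ortho'), a))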
