Considering the following matrix equation:
x=Ab
where:
In[1]:A
Out[1]:
matrix([[ 0.477, -0.277, -0.2 ],
[-0.277, 0.444, -0.167],
[-0.2 , -0.167, 0.367]])
In[2]: b
Out[2]: [0, 60, 40]
How come that when I use numpy.linalg.solve() I get the following results?
import numpy as np
x = np.linalg.solve(A, b)
res=x.tolist()
# res=[1.8014398509481981e+18, 1.801439850948198e+18, 1.8014398509481984e+18]
These numbers are huge! What's wrong here? I suspect A is in the wrong form, since it multiplies b in my equation, whereas numpy.linalg.solve() treats A as if it multiplies x.
What you give as an equation (x=A b) is just a matrix multiplication rather than a set of linear equations to solve (A x=b) for which you would use np.linalg.solve. What you need to do to get x in your case is simply use np.dot (A.dot(b)).
Your matrix is singular, as can be seen by adding its columns which sum to zero. Mathematically, this system is only solvable for a very small set of b vectors.
The solution you're getting is most likely just numerical noise.
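A minimal sketch of both points above, using the values from the question: x = A b is a plain matrix-vector product, and the rows (and, by symmetry, the columns) of A sum to zero, which is why A is singular. The names row_sums and col_sums are only illustrative.

```python
import numpy as np

A = np.array([[ 0.477, -0.277, -0.2  ],
              [-0.277,  0.444, -0.167],
              [-0.2  , -0.167,  0.367]])
b = np.array([0.0, 60.0, 40.0])

# x = A b is a matrix-vector product, not a linear solve.
x = A.dot(b)
print(x)

# Every row and every column sums to (numerically) zero, so the vector
# of ones lies in the null space of A and A has no inverse.
row_sums = A.sum(axis=1)
col_sums = A.sum(axis=0)
print(row_sums, col_sums)
```

Because A is singular, np.linalg.solve(A, b) answers a different question (A x = b) that has no well-defined solution here, which is consistent with the huge values reported.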
I need to solve a set of simultaneous equations of the form Ax = B for x. I've used the numpy.linalg.solve function, inputting A and B, but I get the error 'LinAlgError: Last 2 dimensions of the array must be square'. How do I fix this?
Here's my code:
A = matrix([[v1x, v2x], [v1y, v2y], [v1z, v2z]])
print A
B = [(p2x-p1x-nmag[0]), (p2y-p1y-nmag[1]), (p2z-p1z-nmag[2])]
print B
x = numpy.linalg.solve(A, B)
The values of the matrix/vector are calculated earlier in the code and this works fine, but the values are:
A =
(-0.56666301, -0.52472909)
(0.44034147, 0.46768087)
(0.69641397, 0.71129036)
B =
(-0.38038602567630364, -24.092279373295057, 0.0)
x should have the form (x1,x2,0)
In case you still haven't found an answer, or in case someone in the future has this question.
To solve Ax=b:
numpy.linalg.solve uses LAPACK gesv. As mentioned in the documentation of LAPACK, gesv requires A to be square:
LA_GESV computes the solution to a real or complex linear system of equations AX = B, where A is a square matrix and X and B are rectangular matrices or vectors. Gaussian elimination with row interchanges is used to factor A as A = P*L*U, where P is a permutation matrix, L is unit lower triangular, and U is upper triangular. The factored form of A is then used to solve the above system.
If the matrix A is not square, you have either more unknowns than equations or the other way around. In these situations the system may have no solution or infinitely many solutions. What determines the solution space is the rank of the matrix compared with the number of columns, so you first have to check the rank of the matrix.
That being said, you can use another method to solve your system of linear equations. I suggest having a look at factorization methods like LU, QR, or even SVD. In LAPACK you can use getrs; in Python you can do different things:
first do the factorization like QR and then feed the resulting matrices to a method like scipy.linalg.solve_triangular
solve the least-squares using numpy.linalg.lstsq
Also have a look here where a simple example is formulated and solved.
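As a rough sketch of the two options listed above, here is the 3 by 2 system from the question solved with numpy.linalg.lstsq and, for comparison, through a reduced QR factorization. For simplicity the triangular system is solved with np.linalg.solve rather than scipy.linalg.solve_triangular (a swapped-in shortcut, not what the answer names); the variable names are illustrative.

```python
import numpy as np

A = np.array([[-0.56666301, -0.52472909],
              [ 0.44034147,  0.46768087],
              [ 0.69641397,  0.71129036]])
B = np.array([-0.38038602567630364, -24.092279373295057, 0.0])

# Least-squares solution: minimizes ||A x - B||_2 for the overdetermined system.
# lstsq also reports the rank of A, which tells you how many columns are independent.
x_lstsq, residuals, rank, singular_values = np.linalg.lstsq(A, B, rcond=None)

# Same solution via reduced QR: A = Q R, then solve the square system R x = Q^T B.
Q, R = np.linalg.qr(A)               # Q is 3x2, R is 2x2 upper triangular
x_qr = np.linalg.solve(R, Q.T @ B)

print(rank)
print(x_lstsq)
print(x_qr)
```

With three equations and two unknowns there is generally no exact solution, so both routes return the x that makes A x as close to B as possible.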
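A square matrix is a matrix with the same number of rows and columns. The matrix you are using is 3 by 2. Adding a column of zeros would make it square, but the padded matrix is singular, so numpy.linalg.solve would then fail with a singular-matrix error; a least-squares solve such as numpy.linalg.lstsq is the safer route for a 3 by 2 system.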
[python 2.7 and numpy v1.11.1] I am looking at matrix condition numbers and am trying to compute the condition number for a matrix without using the function np.linalg.cond().
Based on numpy's documentation, the definition of a matrix's condition number is, "the norm of x times the norm of the inverse of x."
||X|| * ||X^-1||
for the matrix
a = np.matrix([[1, 1, 1],
[2, 2, 1],
[3, 3, 0]])
print np.linalg.cond(a)
1.84814479698e+16
print np.linalg.norm(a) * np.linalg.norm(np.linalg.inv(a))
2.027453660713377e+17
Where is the mistake in my computation?
Thanks!
You are trying to compute the condition using the Frobenius Norm definition. That is an optional parameter to the condition computation.
print(np.linalg.norm(a)*np.linalg.norm(np.linalg.inv(a)))
print(np.linalg.cond(a, p='fro'))
Produces
2.02745366071e+17
2.02745366071e+17
norm uses the Frobenius norm for matrices by default, whereas cond uses the 2-norm:
In [347]: np.linalg.cond(a)
Out[347]: 38.198730775206172
In [348]:np.linalg.norm(a,2)*np.linalg.norm(np.linalg.inv(a),2)
Out[348]: 38.198730775206243
In [349]: np.linalg.norm(a)*np.linalg.norm(np.linalg.inv(a))
Out[349]: 39.29814570824248
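To make the difference between the two defaults concrete, here is a small sketch on a hypothetical, well-conditioned 2x2 matrix m (not the matrix from the question). It checks that cond matches the product of norms only when both use the same norm.

```python
import numpy as np

# Hypothetical invertible matrix, chosen only for illustration.
m = np.array([[4.0, 1.0],
              [2.0, 3.0]])
m_inv = np.linalg.inv(m)

# cond defaults to the 2-norm (ratio of largest to smallest singular value).
cond_2 = np.linalg.cond(m)
manual_2 = np.linalg.norm(m, 2) * np.linalg.norm(m_inv, 2)

# norm defaults to the Frobenius norm for matrices, so the matching call is p='fro'.
cond_fro = np.linalg.cond(m, 'fro')
manual_fro = np.linalg.norm(m) * np.linalg.norm(m_inv)

print(cond_2, manual_2)
print(cond_fro, manual_fro)
```

The Frobenius-based value is never smaller than the 2-norm value, since the Frobenius norm of a matrix is at least its 2-norm.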
NumPy cond() is currently buggy. There will come a time when we will fix it, but for now, if you are doing this for linear equation solutions, you can use SciPy linalg.solve. It raises an error for exact singularity, emits a warning if the reciprocal condition number is below a threshold, and does nothing extra if the array is invertible.
Using example from Andrew Ng's class (finding parameters for Linear Regression using normal equation):
With Python:
X = np.array([[1, 2104, 5, 1, 45], [1, 1416, 3, 2, 40], [1, 1534, 3, 2, 30], [1, 852, 2, 1, 36]])
y = np.array([[460], [232], [315], [178]])
θ = ((np.linalg.inv(X.T.dot(X))).dot(X.T)).dot(y)
print(θ)
Result:
[[ 7.49398438e+02]
[ 1.65405273e-01]
[ -4.68750000e+00]
[ -4.79453125e+01]
[ -5.34570312e+00]]
With Julia:
X = [1 2104 5 1 45; 1 1416 3 2 40; 1 1534 3 2 30; 1 852 2 1 36]
y = [460; 232; 315; 178]
θ = ((X' * X)^-1) * X' * y
Result:
5-element Array{Float64,1}:
207.867
0.0693359
134.906
-77.0156
-7.81836
Furthermore, when I multiply X by Julia's θ (but not Python's), I get numbers close to y.
I can't figure out what I am doing wrong. Thanks!
Using X^-1 vs the pseudo inverse
pinv(X), which computes the pseudo inverse, is more broadly applicable than inv(X), which is what X^-1 equates to. Neither Julia nor Python does well using inv, but in this case Julia apparently does better.
But if you change the expression to
julia> z=pinv(X'*X)*X'*y
5-element Array{Float64,1}:
188.4
0.386625
-56.1382
-92.9673
-3.73782
you can verify that X*z = y
julia> X*z
4-element Array{Float64,1}:
460.0
232.0
315.0
178.0
A more numerically robust approach in Python, without having to do the matrix algebra yourself is to use numpy.linalg.lstsq to do the regression:
In [29]: np.linalg.lstsq(X, y)
Out[29]:
(array([[ 188.40031942],
[ 0.3866255 ],
[ -56.13824955],
[ -92.9672536 ],
[ -3.73781915]]),
array([], dtype=float64),
4,
array([ 3.08487554e+03, 1.88409728e+01, 1.37100414e+00,
1.97618336e-01]))
(Compare the solution vector with @waTeim's answer in Julia).
You can see the source of the ill-conditioning by printing the matrix inverse you're calculating:
In [30]: np.linalg.inv(X.T.dot(X))
Out[30]:
array([[ -4.12181049e+13, 1.93633440e+11, -8.76643127e+13,
-3.06844458e+13, 2.28487459e+12],
[ 1.93633440e+11, -9.09646601e+08, 4.11827338e+11,
1.44148665e+11, -1.07338299e+10],
[ -8.76643127e+13, 4.11827338e+11, -1.86447963e+14,
-6.52609055e+13, 4.85956259e+12],
[ -3.06844458e+13, 1.44148665e+11, -6.52609055e+13,
-2.28427584e+13, 1.70095424e+12],
[ 2.28487459e+12, -1.07338299e+10, 4.85956259e+12,
1.70095424e+12, -1.26659193e+11]])
Eeep!
Taking the dot product of this with X.T leads to a catastrophic loss of precision.
Notice that X is a 4x5 matrix or in statistical terms that you have fewer observations than parameters to estimate. Therefore, the least squares problem has infinitely many solutions with the sum of the squared errors exactly equal to zero. In this case, the normal equations don't help you much because the matrix X'X is singular. Instead, you should just find a solution to X*b=y.
Most numerical linear algebra systems are based on the FORTRAN package LAPACK, which uses a pivoted QR factorization for solving the problem X*b=y. Since there are infinitely many solutions, LAPACK picks the solution with the smallest norm. In Julia, you can get this solution simply by writing
float(X)\y
(Unfortunately, the float part is necessary right now, but that will change.)
In exact arithmetic, you should get the same solution as the one above with either of your proposed methods, but the floating point representation of your problem introduces small rounding errors, and these errors will affect the calculated solution. The effect of the rounding errors on the solution is much larger when using the normal equations than when using the QR factorization directly on X.
This also holds in the usual case where X has more rows than columns, so it is often recommended that you avoid the normal equations when solving least squares problems. However, when X has many more rows than columns, the matrix X'X is relatively small. In this case, it will be much faster to solve the problem with the normal equations instead of using the QR factorization. In many statistical problems, the extra numerical error is extremely small compared to the statistical error, so the loss of precision due to the normal equations can simply be ignored.
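As a rough Python counterpart to the minimum-norm solution discussed above, the sketch below solves X*b=y directly, without forming X'X, using both numpy.linalg.lstsq and numpy.linalg.pinv. The data are the ones from the question; the variable names are illustrative.

```python
import numpy as np

X = np.array([[1, 2104, 5, 1, 45],
              [1, 1416, 3, 2, 40],
              [1, 1534, 3, 2, 30],
              [1,  852, 2, 1, 36]], dtype=float)
y = np.array([460, 232, 315, 178], dtype=float)

# lstsq works on X itself and returns the minimum-norm solution
# when the system is underdetermined (4 equations, 5 unknowns).
theta_lstsq, _, rank, _ = np.linalg.lstsq(X, y, rcond=None)

# The pseudo-inverse of X gives the same minimum-norm solution.
theta_pinv = np.linalg.pinv(X) @ y

print(rank)             # rank of X; it cannot exceed the 4 rows
print(theta_lstsq)
print(X @ theta_lstsq)  # reproduces y up to rounding
```

Since X has only four rows, X'X (5x5) can have rank at most four, which is why inverting it is not meaningful here.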
I am looking to solve a problem of the type: Aw = xBw where x is a scalar (eigenvalue), w is an eigenvector, and A and B are symmetric, square numpy matrices of equal dimension. I should be able to find d x/w pairs if A and B are d x d. How would I solve this in numpy? I was looking in the Scipy docs and not finding anything like what I wanted.
For real symmetric or complex Hermitian dense matrices, you can use scipy.linalg.eigh() to solve a generalized eigenvalue problem. To avoid extracting all the eigenvalues you can specify only the desired ones by using subset_by_index:
from scipy.linalg import eigh
# subset_by_index takes a two-element [start, end] pair of indices (both inclusive)
eigvals, eigvecs = eigh(A, B, eigvals_only=False, subset_by_index=[0, 2])
One could use eigvals_only=True to obtain only the eigenvalues.
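A small self-contained sketch of this approach, assuming SciPy 1.5 or later (for subset_by_index) and using two hypothetical 3x3 matrices: A symmetric and B symmetric positive definite, as eigh requires for the generalized problem. It checks that each returned pair satisfies A w = x B w.

```python
import numpy as np
from scipy.linalg import eigh

# Hypothetical example matrices: A symmetric, B symmetric positive definite.
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 4.0]])
B = np.array([[2.0, 0.5, 0.0],
              [0.5, 1.5, 0.0],
              [0.0, 0.0, 1.0]])

# All d eigenpairs of A w = x B w; eigenvalues come back in ascending order,
# eigenvectors are the columns of eigvecs.
eigvals, eigvecs = eigh(A, B)

# Residual of A w - x B w for every column at once (should be ~0).
residual = A @ eigvecs - (B @ eigvecs) * eigvals

# Only the two smallest eigenvalues, by index range [0, 1].
smallest_two = eigh(A, B, eigvals_only=True, subset_by_index=[0, 1])

print(eigvals)
print(np.abs(residual).max())
print(smallest_two)
```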
Have you seen scipy.linalg.eig? From the documentation:
Solve an ordinary or generalized eigenvalue problem of a square matrix.
This method has an optional parameter b:
scipy.linalg.eig(a, b=None, ...
b : (M, M) array_like, optional
Right-hand side matrix in a generalized eigenvalue problem.
Default is None, identity matrix is assumed.
I have the following matrices sigma and sigmad:
sigma:
1.9958 0.7250
0.7250 1.3167
sigmad:
4.8889 1.1944
1.1944 4.2361
If I try to solve the generalized eigenvalue problem in python I obtain:
d,V = sc.linalg.eig(matrix(sigmad),matrix(sigma))
V:
-1 -0.5614
-0.4352 1
If I try to solve the generalized eigenvalue problem in MATLAB I obtain:
[V,d]=eig(sigmad,sigma)
V:
-0.5897 -0.5278
-0.2564 0.9400
But the d's do coincide.
Any (nonzero) scalar multiple of an eigenvector will also be an eigenvector; only the direction is meaningful, not the overall normalization. Different routines use different conventions -- often you'll see the magnitude set to 1, or the maximum value set to 1 or -1 -- and some routines don't even bother being internally consistent for performance reasons. Your two different results are multiples of each other:
In [227]: sc = array([[-1., -0.5614], [-0.4352, 1. ]])
In [228]: ml = array([[-.5897, -0.5278], [-0.2564, 0.94]])
In [229]: sc/ml
Out[229]:
array([[ 1.69577751, 1.06366048],
[ 1.69734789, 1.06382979]])
and so they're actually the same eigenvectors. Think of the matrix as an operator which changes a vector: the eigenvectors are the special directions where a vector pointing that way won't be twisted by the matrix, and the eigenvalues are the factors measuring how much the matrix expands or contracts the vector.
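The rescaling argument can be sketched numerically with the sigma and sigmad from the question. The code below computes the eigenvectors with scipy.linalg.eig and rescales each column two ways: so that its largest entry has magnitude 1, and so that v' * sigma * v = 1. The second scaling is an observation that happens to match the MATLAB numbers quoted above, not something the thread states; signs may still differ, so only magnitudes are compared.

```python
import numpy as np
from scipy import linalg

sigma = np.array([[1.9958, 0.7250],
                  [0.7250, 1.3167]])
sigmad = np.array([[4.8889, 1.1944],
                   [1.1944, 4.2361]])

# Generalized problem: sigmad v = d sigma v
d, V = linalg.eig(sigmad, sigma)
order = np.argsort(d.real)          # sort eigenpairs by ascending eigenvalue
d = d.real[order]
V = np.real(V[:, order])

# Convention 1: largest entry of each column has magnitude 1.
V_max = V / np.abs(V).max(axis=0)

# Convention 2: each column scaled so that v' * sigma * v = 1.
b_norms = np.sqrt(np.sum(V * (sigma @ V), axis=0))
V_b = V / b_norms

# Either scaling still satisfies the eigenvalue equation.
residual = sigmad @ V_b - (sigma @ V_b) * d

print(np.abs(V_max))
print(np.abs(V_b))
print(np.abs(residual).max())
```

Up to sign, the first scaling matches the SciPy output quoted in the question and the second matches the MATLAB output, so both are the same eigenvectors under different normalizations.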