I have a simple problem.
I have two matrices A and B,
and I want to find a transformation X that makes AX as close to B as possible in the least squares sense,
i.e. find X = argmin ||AX - B|| in the 2-norm.
How can I solve this problem using numpy or scipy?
I tried to search, but the only method I could find is lstsq (https://docs.scipy.org/doc/numpy/reference/generated/numpy.linalg.lstsq.html), and that appears to solve a linear system for a vector x.
How can I solve my problem, i.e. find a transformation X for the matrix A?
Are there any non-linear methods that do this? How is this problem solved in optimization?
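For what it's worth, numpy.linalg.lstsq already accepts a two-dimensional right-hand side, in which case it minimizes ||AX - B|| in the Frobenius norm and returns a matrix X with one column per column of B. A minimal sketch (the data below is purely illustrative, not from the question):

import numpy as np

rng = np.random.default_rng(0)          # illustrative random data
A = rng.normal(size=(6, 3))
B = rng.normal(size=(6, 2))

X, residuals, rank, sv = np.linalg.lstsq(A, B, rcond=None)
print(X.shape)                          # (3, 2): one least-squares column of X per column of B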
Related
Suppose I have a symmetric matrix A and a vector b and want to find A^(-1) b. Now, this is well-known to be doable in time O(N^2) (where N is the dimension of the vector/matrix), and I believe that in MATLAB this can be done as A\b. But all I can find in python is numpy.linalg.solve(), which will do Gaussian elimination, which is O(N^3). I must not be looking in the right place...
scipy.linalg.solve has an argument to make it assume a symmetric matrix:
x = scipy.linalg.solve(A, b, assume_a="sym")
If you know your matrix is not just symmetric but positive definite, you can pass this stronger assumption instead with assume_a="pos".
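For example, assuming the same A and b as above:

import scipy.linalg
x_sym = scipy.linalg.solve(A, b, assume_a="sym")   # symmetric (indefinite) solver
x_pos = scipy.linalg.solve(A, b, assume_a="pos")   # Cholesky-based, requires A positive definite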
I'm trying to implement the idea I have suggested here, for the Cauchy product of multivariate finite power series (i.e. polynomials) represented as NumPy ndarrays. numpy.convolve does the job for 1D arrays, but to the best of my knowledge there is no implementation of convolution for arbitrary-dimensional arrays. In the above link, I have suggested the equation:
for the convolution of two n-dimensional arrays Phi of shape P=[p1,...,pn] and Psi of shape Q=[q1,...,qn], where:
the omegas are the elements of the n-dimensional array Omega of shape O=P+Q-1
<A,B>_F is the generalization of the Frobenius inner product for arbitrary-dimensional arrays A and B of the same shape
A^F is A flipped in all n directions
{A}_[k1,...,kn] is the slice of A starting from [0,...,0] up to [k1,...,kn]
Psi' is Psi extended with zeros to have the shape O as defined above
I tried implementing the above functions one by one:
import numpy as np

def crop(A, D1, D2):
    return A[tuple(slice(D1[i], D2[i]) for i in range(D1.shape[0]))]
as was suggested here, slices/crops A from D1 to D2,
def sumall(A):
    sum1 = A
    for k in range(A.ndim):
        sum1 = np.sum(sum1, axis=0)
    return sum1
is a generalization of numpy.sum for multidimensional ndarrays,
def flipall(A):
    A1 = A
    for k in range(A.ndim):
        A1 = np.flip(A1, k)
    return A1
flips A in all existing axes, and finally
def conv(A, B, K):
    D0 = np.zeros(K.shape, dtype=K.dtype)
    return sumall(np.multiply(
        crop(A, np.maximum(D0, np.minimum(A.shape, K - B.shape)),
                np.minimum(A.shape, K)),
        flipall(crop(B, np.maximum(D0, np.minimum(B.shape, K - A.shape)),
                        np.minimum(B.shape, K)))))
where K=[k1,...,kn] with 0<=kj<=oj, is a modified version of the formula above which only calculates the non-zero multiplications, to be more efficient. Now I'm trying to populate the Omega array using fromfunction or meshgrid in combination with vectorize, as suggested here, but I have failed so far. Now my questions, in prioritized order, are:
how can I implement the final step and populate the final array in an efficient and pythonic way?
are there more efficient implementations of the functions above? Or how would you implement the formula?
is my equation correct? does this represent multiplication of multivariate finite power series?
Has no one else implemented this before in NumPy, or am I reinventing the wheel here? I would appreciate it if you could point me towards other solutions.
I would appreciate it if you could help me with these questions. Thanks for your help in advance.
P.S.1 You may find some examples and other information in this GitHub Gist
P.S.2 Here in the AstroPy mailing list I was told that scipy.signal.convolve and/or scipy.ndimage.convolve do the job for higher dimensions as well. There is also a scipy.ndimage.filters.convolve. Here I have explained why they are not what I'm looking for.
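For what it's worth, here is a minimal sketch (my own, not from the post) of how the final step could look: a plain loop over all output indices with np.ndindex, filling Omega with the conv() helper defined above. It is the straightforward, non-vectorized version, and it assumes the 1-based index convention that the slice bounds inside conv() imply (K running from 1 to oj in each axis).

def cauchy_product(Phi, Psi):
    # output shape O = P + Q - 1, as defined in the question
    out_shape = tuple(p + q - 1 for p, q in zip(Phi.shape, Psi.shape))
    Omega = np.zeros(out_shape, dtype=np.result_type(Phi, Psi))
    for idx in np.ndindex(*out_shape):
        # conv() expects K as an array; +1 converts the 0-based idx
        # to the 1-based K assumed here
        Omega[idx] = conv(Phi, Psi, np.array(idx) + 1)
    return Omega

In 1D this should reproduce np.convolve(Phi, Psi), which gives a quick way to sanity-check the equation.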
I need to solve a set of simultaneous equations of the form Ax = B for x. I've used the numpy.linalg.solve function, inputting A and B, but I get the error 'LinAlgError: Last 2 dimensions of the array must be square'. How do I fix this?
Here's my code:
A = matrix([[v1x, v2x], [v1y, v2y], [v1z, v2z]])
print A
B = [(p2x-p1x-nmag[0]), (p2y-p1y-nmag[1]), (p2z-p1z-nmag[2])]
print B
x = numpy.linalg.solve(A, B)
The values of the matrix/vector are calculated earlier in the code and this works fine, but the values are:
A =
(-0.56666301, -0.52472909)
(0.44034147, 0.46768087)
(0.69641397, 0.71129036)
B =
(-0.38038602567630364, -24.092279373295057, 0.0)
x should have the form (x1,x2,0)
In case you still haven't found an answer, or in case someone has this question in the future:
To solve Ax=b:
numpy.linalg.solve uses LAPACK gesv. As mentioned in the documentation of LAPACK, gesv requires A to be square:
LA_GESV computes the solution to a real or complex linear system of equations AX = B, where A is a square matrix and X and B are rectangular matrices or vectors. Gaussian elimination with row interchanges is used to factor A as A = PL*U , where P is a permutation matrix, L is unit lower triangular, and U is upper triangular. The factored form of A is then used to solve the above system.
If the matrix A is not square, it means that you have either more variables than equations or the other way around. In these situations, you can have either no solution or an infinite number of solutions. What determines the solution space is the rank of the matrix compared to the number of columns. Therefore, you first have to check the rank of the matrix.
That being said, you can use another method to solve your system of linear equations. I suggest having a look at factorization methods like LU, QR, or even SVD. In LAPACK you can use getrs; in Python you can do different things:
first do a factorization such as QR and then feed the resulting matrices to a method like scipy.linalg.solve_triangular
solve the least-squares problem using numpy.linalg.lstsq (both approaches are sketched below)
Also have a look here where a simple example is formulated and solved.
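A minimal sketch of both suggestions (my own illustration, using the values from the question):

import numpy as np
from scipy.linalg import solve_triangular

A = np.array([[-0.56666301, -0.52472909],
              [ 0.44034147,  0.46768087],
              [ 0.69641397,  0.71129036]])
b = np.array([-0.38038602567630364, -24.092279373295057, 0.0])

# Least squares via QR: A = QR (reduced), then solve the triangular system R x = Q^T b
Q, R = np.linalg.qr(A)
x_qr = solve_triangular(R, Q.T @ b)

# The same least-squares solution directly from lstsq
x_lstsq, residuals, rank, sv = np.linalg.lstsq(A, b, rcond=None)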
A square matrix is a matrix with the same number of rows and columns. The matrix you are using is 3 by 2. Add a column of zeroes to fix this problem.
I calculate my covariance with following formula:
np.dot(X_zero_mean, X_zero_mean.T) / (X_zero_mean.shape[0] -1)
and compare it to
np.cov(X_zero_mean.T)
I both print the resulting matrices to console and create a figure out of them, but they are not the same. Why? Could it be that cov avoids some numerical error, that is happening with my above formula?
The first one is my covariance, the second one is numpy's cov:
Without your exact matrices it's difficult to tell, but I would guess that it's because you're taking the transpose of the matrix before passing it to np.cov. That would also explain why numpy's result looks like it's of much higher dimension than yours. np.cov(X.T) is equivalent to np.dot(X.T, X), not np.dot(X, X.T).
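A quick numerical check of that (my own illustration; X here is random data with samples in rows):

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))                 # 100 samples, 3 features
X_zero_mean = X - X.mean(axis=0)

manual = np.dot(X_zero_mean.T, X_zero_mean) / (X_zero_mean.shape[0] - 1)
print(np.allclose(manual, np.cov(X_zero_mean.T)))   # True: the two agree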
How do I get the inverse of a matrix in python? I've implemented it myself, but it's pure python, and I suspect there are faster modules out there to do it.
You should have a look at numpy if you do matrix manipulation. This is a module mainly written in C, which will be much faster than programming in pure python. Here is an example of how to invert a matrix, and do other matrix manipulation.
from numpy import matrix
from numpy import linalg
A = matrix( [[1,2,3],[11,12,13],[21,22,23]]) # Creates a matrix.
x = matrix( [[1],[2],[3]] ) # Creates a matrix (like a column vector).
y = matrix( [[1,2,3]] ) # Creates a matrix (like a row vector).
print A.T # Transpose of A.
print A*x # Matrix multiplication of A and x.
print A.I # Inverse of A.
print linalg.solve(A, x) # Solve the linear equation system.
You can also have a look at the array module, which is a much more efficient implementation of lists when you have to deal with only one data type.
Make sure you really need to invert the matrix. This is often unnecessary and can be numerically unstable. When most people ask how to invert a matrix, they really want to know how to solve Ax = b where A is a matrix and x and b are vectors. It's more efficient and more accurate to use code that solves the equation Ax = b for x directly than to compute the inverse of A and then multiply it by b. Even if you need to solve Ax = b for many b values, it's not a good idea to invert A: save the Cholesky factorization of A instead, but don't invert it.
See Don't invert that matrix.
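As a sketch of that last point (my own illustration, assuming A is symmetric positive definite): compute the Cholesky factorization once and reuse it for each right-hand side.

import numpy as np
from scipy.linalg import cho_factor, cho_solve

A = np.array([[4.0, 1.0], [1.0, 3.0]])    # small SPD matrix, purely illustrative
c_and_low = cho_factor(A)                  # factor once

b1 = np.array([1.0, 2.0])
b2 = np.array([3.0, 5.0])
x1 = cho_solve(c_and_low, b1)              # reuse the factorization for each b
x2 = cho_solve(c_and_low, b2)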
It is a pity that the chosen matrix, repeated here again, is either singular or badly conditioned:
A = matrix( [[1,2,3],[11,12,13],[21,22,23]])
By definition, the inverse of A when multiplied by the matrix A itself must give a unit matrix. The A chosen in the much praised explanation does not do that. In fact just looking at the inverse gives a clue that the inversion did not work correctly. Look at the magnitude of the individual terms - they are very, very big compared with the terms of the original A matrix...
It is remarkable that humans, when picking an example matrix, so often manage to pick a singular one!
I did have a problem with the solution, so I looked into it further. On the ubuntu-kubuntu platform, the debian package numpy does not have the matrix and the linalg sub-packages, so in addition to importing numpy, scipy needs to be imported as well.
If the diagonal terms of A are multiplied by a large enough factor, say 2, the matrix will most likely cease to be singular or near singular. So
A = matrix( [[2,2,3],[11,24,13],[21,22,46]])
becomes neither singular nor nearly singular and the example gives meaningful results... When dealing with floating-point numbers one must be watchful for the effects of unavoidable round-off errors.
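One quick way to spot the problem described above (my own illustration) is to look at the condition number before trusting an inverse:

import numpy as np

A_bad  = np.array([[1, 2, 3], [11, 12, 13], [21, 22, 23]], dtype=float)
A_good = np.array([[2, 2, 3], [11, 24, 13], [21, 22, 46]], dtype=float)

print(np.linalg.cond(A_bad))    # enormous: the matrix is (numerically) singular
print(np.linalg.cond(A_good))   # moderate: inversion gives meaningful results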
For those like me, who were looking for a pure Python solution without pandas or numpy involved, check out the following GitHub project: https://github.com/ThomIves/MatrixInverse.
It generously provides a very good explanation of what the process looks like "behind the scenes". The author has nicely described the step-by-step approach and presented some practical examples, all easy to follow.
This is just a little code snippet from there to illustrate the approach very briefly (AM is the source matrix, IM is the identity matrix of the same size):
def invert_matrix(AM, IM):
    for fd in range(len(AM)):
        # Scale the focus-diagonal row so that AM[fd][fd] becomes 1
        fdScaler = 1.0 / AM[fd][fd]
        for j in range(len(AM)):
            AM[fd][j] *= fdScaler
            IM[fd][j] *= fdScaler
        # Subtract a multiple of the focus row from every other row,
        # zeroing out the rest of the focus column
        for i in list(range(len(AM)))[0:fd] + list(range(len(AM)))[fd+1:]:
            crScaler = AM[i][fd]
            for j in range(len(AM)):
                AM[i][j] = AM[i][j] - crScaler * AM[fd][j]
                IM[i][j] = IM[i][j] - crScaler * IM[fd][j]
    return IM
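A hypothetical usage of the snippet (my own, not from the linked project): pass a copy of the matrix together with an identity matrix of the same size, since the function modifies both arguments in place.

A = [[4.0, 7.0], [2.0, 6.0]]
I = [[1.0 if i == j else 0.0 for j in range(len(A))] for i in range(len(A))]

A_inv = invert_matrix([row[:] for row in A], I)   # copy A: invert_matrix mutates it
print(A_inv)                                      # roughly [[0.6, -0.7], [-0.2, 0.4]]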
But please do follow the entire thing, you'll learn a lot more than just copy-pasting this code! There's a Jupyter notebook as well, btw.
Hope that helps someone, I personally found it extremely useful for my very particular task (Absorbing Markov Chain) where I wasn't able to use any non-standard packages.
You could calculate the determinant of the matrix, which can be computed recursively, and then form the adjugate matrix.
Here is a short tutorial
I think this only works for square matrices
Another way of computing these involves Gram-Schmidt orthogonalization and then transposing the matrix; the transpose of an orthogonal matrix is its inverse!
Numpy will be suitable for most people, but you can also do matrices in Sympy
Try running these commands at http://live.sympy.org/
M = Matrix([[1, 3], [-2, 3]])
M
M**-1
For fun, try M**(1/2)
If you hate numpy, get out RPy and your local copy of R, and use it instead.
(I would also echo the advice to make sure you really need to invert the matrix. numpy's linalg.solve and R's solve() function, for example, don't actually do a full inversion, since it is unnecessary.)