Inverting large arrays in Python

Inverting large arrays in Python - python

I have two arrays R3_mod with shape (21,21) containing many zeros and P2 with shape (21,) containing many zeros. I am getting the inverse of R3_mod using np.linalg.pinv() and eventually multiplying it to P2 as shown below. Is there a more efficient way to invert such arrays and then multiply?
Since the arrays are too big, you can access it here: https://drive.google.com/drive/u/0/folders/1NjEiNoneMaCbmbmObEs2GCNIb08NFIy3
import numpy as np
X = np.linalg.pinv(R3_mod).dot(P2)

Assuming that the matrix R3_mod is indeed invertible, I think it's best to use np.linalg.inv instead of linalg.pinv.
inv computes the inverse of the matrix directly, where pinv (stands for pseudo-inverse, see https://en.wikipedia.org/wiki/Moore%E2%80%93Penrose_inverse) computes the matrix A' that minimizes |AA'-I|. If the input matrix is invertible, pinv should return the same result as inv.

Related

Is numpy matrix multiplication same as Linear Algebra matrix multiplication?

As we know that In Linear Algebra it is mandatory to multiply a vector by matrix or multiply two matrices, the number of rows of one matrix or vector must be equal to the number of columns in other vector or matrix.
while i was working in numpy python and it is giving me a different result.
Here is my code and it works.
np.array([1,2]) * np.array([[1],[2],[3]])
so is there any difference between numpy vector to matrix
matlication vs linear algebra vector to matrix multiplication.

use numpy np.dot(a,b)
Use the following code and you will get error you want.
np.dot(np.array([1,2]) , np.array([[1],[2],[3]]))
Becuase *,+,-,/ works element-wise on arrays.
If either a or b is 0-D (scalar), it is equivalent to multiply and
using numpy.multiply(a, b) or a * b is preferred.

Why multiplication functions of scipy sparse and numpy arrays give different results?

I have two matrices in Python 2.7: one dense A_dense and the another sparse matrix A_sparse. I am interested in computing element-wise multiplication followed by sum. There are two ways to do it: use numpy's multiplication or scipy sparse multiplication. I expect them to give exactly same result with difference in execution time. But I find that they give different results for certain matrix sizes.
import numpy as np
from scipy import sparse
L=2000
np.random.seed(2)
rand_x=np.random.rand(L)
A_sparse_init=np.diag(rand_x, -1)+np.diag(rand_x, 1)
A_sparse=sparse.csr_matrix(A_sparse_init)
A_dense=np.random.rand(L+1,L+1)
print np.sum(A_sparse.multiply(A_dense))-np.sum(np.multiply(A_dense[A_sparse.nonzero()], A_sparse.data))
Output:
1.1368683772161603e-13
If I choose L=2001, then output is:
0.0
To check the size dependence of the difference using two different multiplication method, I wrote:
L=100
np.random.seed(2)
N_loop=100
multiply_diff_arr=np.zeros(N_loop)
for i in xrange(N_loop):
rand_x=np.random.rand(L)
A_sparse_init=np.diag(rand_x, -1)+np.diag(rand_x, 1)
A_sparse=sparse.csr_matrix(A_sparse_init)
A_dense=np.random.rand(L+1,L+1)
multiply_diff_arr[i]=np.sum(A_sparse.multiply(A_dense))-np.sum(np.multiply(A_dense[A_sparse.nonzero()], A_sparse.data))
L+=1
I got the following plot:
Can anyone help me understand what's happening? Don't we expect the difference between two methods to be at least 1e-18 rather than 1e-13?

I don't have a full answer, but this might help find the answer:
Under the hood, scipy.sparse will convert to coo format and do this:
ret = self.tocoo()
if self.shape == other.shape:
data = np.multiply(ret.data, other[ret.row, ret.col])
The question is then why these two operations give different results:
ret = A_sparse.tocoo()
c = np.multiply(ret.data, A_dense[ret.row, ret.col])
ret.data = c.view(type=np.ndarray)
c.sum() - ret.sum()
-1.1368683772161603e-13
Edit:
The difference stems from different defaults on which axis to add.reduce first.
E.g.:
A_sparse.multiply(A_dense).sum(axis=1).sum()
A_sparse.multiply(A_dense).sum(axis=0).sum()
Numpy defaults to 0 first.

Numpy array and matrix multiplication

I am trying to get rid of the for loop and instead do an array-matrix multiplication to decrease the processing time when the weights array is very large:
import numpy as np
sequence = [np.random.random(10), np.random.random(10), np.random.random(10)]
weights = np.array([[0.1,0.3,0.6],[0.5,0.2,0.3],[0.1,0.8,0.1]])
Cov_matrix = np.matrix(np.cov(sequence))
results = []
for w in weights:
result = np.matrix(w)*Cov_matrix*np.matrix(w).T
results.append(result.A)
Where:
Cov_matrix is a 3x3 matrix
weights is an array of n lenght with n 1x3 matrices in it.
Is there a way to multiply/map weights to Cov_matrix and bypass the for loop? I am not very familiar with all the numpy functions.

I'd like to reiterate what's already been said in another answer: the np.matrix class has much more disadvantages than advantages these days, and I suggest moving to the use of the np.array class alone. Matrix multiplication of arrays can be easily written using the # operator, so the notation is in most cases as elegant as for the matrix class (and arrays don't have several restrictions that matrices do).
With that out of the way, what you need can be done in terms of a call to np.einsum. We need to contract certain indices of three matrices while keeping one index alone in two matrices. That is, we want to perform w_{ij} * Cov_{jk} * w.T_{ki} with a summation over j, k, giving us an array with i indices. The following call to einsum will do:
res = np.einsum('ij,jk,ik->i', weights, Cov_matrix, weights)
Note that the above will give you a single 1d array, whereas you originally had a list of arrays with shape (1,1). I suspect the above result will even make more sense. Also, note that I omitted the transpose in the second weights argument, and this is why the corresponding summation indices appear as ik rather than ki. This should be marginally faster.
To prove that the above gives the same result:
In [8]: results # original
Out[8]: [array([[0.02803215]]), array([[0.02280609]]), array([[0.0318784]])]
In [9]: res # einsum
Out[9]: array([0.02803215, 0.02280609, 0.0318784 ])

The same can be achieved by working with the weights as a matrix and then looking at the diagonal elements of the result. Namely:
np.diag(weights.dot(Cov_matrix).dot(weights.transpose()))
which gives:
array([0.03553664, 0.02394509, 0.03765553])
This does more calculations than necessary (calculates off-diagonals) so maybe someone will suggest a more efficient method.
Note: I'd suggest slowly moving away from np.matrix and instead work with np.array. It takes a bit of getting used to not being able to do A*b but will pay dividends in the long run. Here is a related discussion.

Matrix inverse in numpy/python not giving correct matrix?

I have a nxn matrix C and use inv from numpy.linalg to take the inverse to get Cinverse. My Cmatrix has elements of order 10**4 but my Cinverse matrix has elements of order 10**12 and higher (not sure if thats correct). When I do numpyp.dot(C,Cinverse), I do not get the identity matrix. Why is this?
I have a vector x which I multiply by itself to get a matrix.
x=array([ 121.41191662, 74.22830468, 73.23156336, 75.48354975,
79.89580817])
c=np.outer(xvector,xvector)
this is a 5x5 matrix.
then I get its inverse by
from numpy.linalg import inv
cinverse=inv(c)
then I want to see if I can get identity matrix back.
identity=np.dot(C00,C00inv)
However, I do not get the identity matrix. cinverse has very large matrix elements
around 10**13 and higher while c has matrix elements around 10,000.

The outer product of two vectors (be they the same or not) is not invertible. Since it is just a stack of scaled copies of the same vector its rank is one. Rank defective matrices cannot be inverted.
I'm surprised that numpy is not raising an exception or at least giving a warning.

So here is some code that generates the inverse matrix, and I will comment about it afterwards.
import numpy as np
x = np.random.rand(5,5)*10000 # makes a 5x5 matrix with elements around 10000
xin = np.linalg.inv(x)
iden = np.dot(x,xinv)
Now the first line of your iden matrix probably looks something like this:
[ 1.00000000e+00, -2.05382445e-16, -5.61067365e-16, 1.99719718e-15, -2.12322957e-16]
. Notice that the first element is exactly 1, as it should be, but there others are not exactly 0, however they are essentially zero and should be regarded as zero according to machine precision.

Matlab to Python numpy indexing and multiplication issue

I have the following line of code in MATLAB which I am trying to convert to Python numpy:
pred = traindata(:,2:257)*beta;
In Python, I have:
pred = traindata[ : , 1:257]*beta
beta is a 256 x 1 array.
In MATLAB,
size(pred) = 1389 x 1
But in Python,
pred.shape = (1389L, 256L)
So, I found out that multiplying by the beta array is producing the difference between the two arrays.
How do I write the original Python line, so that the size of pred is 1389 x 1, like it is in MATLAB when I multiply by my beta array?

I suspect that beta is in fact a 1D numpy array. In numpy, 1D arrays are not row or column vectors where MATLAB clearly makes this distinction. These are simply 1D arrays agnostic of any shape. If you must, you need to manually introduce a new singleton dimension to the beta vector to facilitate the multiplication. On top of this, the * operator actually performs element-wise multiplication. To perform matrix-vector or matrix-matrix multiplication, you must use numpy's dot function to do so.
Therefore, you must do something like this:
import numpy as np # Just in case
pred = np.dot(traindata[:, 1:257], beta[:,None])
beta[:,None] will create a 2D numpy array where the elements from the 1D array are populated along the rows, effectively making a column vector (i.e. 256 x 1). However, if you have already done this on beta, then you don't need to introduce the new singleton dimension. Just use dot normally:
pred = np.dot(traindata[:, 1:257], beta)

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.

Inverting large arrays in Python - python

Related

Is numpy matrix multiplication same as Linear Algebra matrix multiplication?

Why multiplication functions of scipy sparse and numpy arrays give different results?

Numpy array and matrix multiplication

Matrix inverse in numpy/python not giving correct matrix?

Matlab to Python numpy indexing and multiplication issue

Categories

Resources