Difference between array and matrix numpy for solving linear equations

Difference between array and matrix numpy for solving linear equations - python

There are many questions already asked in the same grounds.
I also read the official documentation (http://www.scipy.org/scipylib/faq.html#what-is-the-difference-between-matrices-and-arrays) regarding the differences. But I am still struggling to understand the philosophical difference between numpy arrays and matrices.
More preciously I am seeking the reason for the below mention results.
#using array
>>> A = np.array([[ 1, -1, 2],
[ 0, 1, -1],
[ 0, 0, 1]])
>>> b = np.array([5,-1,3])
>>> x = np.linalg.solve(A,b)
>>> x
array([ 1., 2., 3.])
`#using matrix
>>> A=np.mat(A)
>>> b=np.mat(b)
>>> A
matrix([[ 1, -1, 2],
[ 0, 1, -1],
[ 0, 0, 1]])
>>> b
matrix([[ 5, -1, 3]])
>>> x = np.linalg.solve(A,b)
>>> x
matrix([[ 5., -1., 3.],
[ 10., -2., 6.],
[ 5., -1., 3.]])
Why the linear equations represented as array yields correct solution while the matrix representation yields another matrix solution.
Also honestly I don't understand the reason for getting matrix as a solution in the second case.
Sorry if the question is already answered and I failed to notice and also pardon me if my understanding of numpy array and matrix is wrong.

You have a transpose issue...when you go to matrix land, column-vectors and row-vectors are no longer interchangeable:
import numpy as np
A = np.array([[ 1, -1, 2],
[ 0, 1, -1],
[ 0, 0, 1]])
b = np.array([5,-1,3])
x = np.linalg.solve(A, b)
print 'arrays:'
print x
A = np.matrix(A)
b = np.matrix(b)
x = np.linalg.solve(A, b)
print 'matrix, wrong set up:'
print x
b = b.T
x = np.linalg.solve(A, b)
print 'matrix, right set up:'
print x
yields:
arrays:
[ 1. 2. 3.]
matrix, wrong set up:
[[ 5. -1. 3.]
[ 10. -2. 6.]
[ 5. -1. 3.]]
matrix, right set up:
[[ 1.]
[ 2.]
[ 3.]]

Related

Reversing output matrix values on Numpy! Is there a specific command for this?

I am receiving the right answer when I compute the Vandermonde
coefficients of this matrix. However, the output matrix is reversed.
It should be [6,-39,55,27] instead of [27,55,-39,6].
My output for my Vandermonde Matrix is flipped and the final solution
c, is flipped.
import numpy as np
from numpy import linalg as LA
x = np.array([[4],[2],[0],[-1]])
f = np.array([[7],[29],[27],[-73]])
def main():
A_matrix = VandermondeMatrix(x)
print(A_matrix)
c = LA.solve(A_matrix,f) #coefficients of Vandermonde Polynomial
print(c)
def VandermondeMatrix(x):
n = len(x)
A = np.zeros((n, n))
exponent = np.array(range(0,n))
for j in range(n):
A[j, :] = x[j]**exponent
return A
if __name__ == "__main__":
main()

Just make the exponent range the other way around from the beginning, then you don't have to flip afterwards reducing runtime:
def VandermondeMatrix(x):
n = len(x)
A = np.zeros((n, n))
exponent = np.array(range(n-1,-1,-1))
for j in range(n):
A[j, :] = x[j]**exponent
return A
Out:
#A_matrix:
[[64. 16. 4. 1.]
[ 8. 4. 2. 1.]
[ 0. 0. 0. 1.]
[-1. 1. -1. 1.]]
#c:
[[ 6.]
[-39.]
[ 55.]
[ 27.]]

np.flip(c)?
link to documentation

You could do
print(c[::-1])
which will reverse the order of c.
From How can I flip the order of a 1d numpy array?

There is a parameter that does exactly that: increasing=True
Example from the documentation:
x = np.array([1, 2, 3, 5])
np.vander(x)
array([[ 1, 1, 1, 1],
[ 8, 4, 2, 1],
[ 27, 9, 3, 1],
[125, 25, 5, 1]])
np.vander(x, increasing=True)
array([[ 1, 1, 1, 1],
[ 1, 2, 4, 8],
[ 1, 3, 9, 27],
[ 1, 5, 25, 125]])

In [3]: def VandermondeMatrix(x):
...: n = len(x)
...: A = np.zeros((n, n))
...: exponent = np.array(range(0,n))
...: for j in range(n):
...: A[j, :] = x[j]**exponent
...: return A
...:
In [4]: x = np.array([[4],[2],[0],[-1]])
In [5]: VandermondeMatrix(x)
Out[5]:
array([[ 1., 4., 16., 64.],
[ 1., 2., 4., 8.],
[ 1., 0., 0., 0.],
[ 1., -1., 1., -1.]])
In [6]: f = np.array([[7],[29],[27],[-73]])
In [7]: np.linalg.solve(_5,f)
Out[7]:
array([[ 27.],
[ 55.],
[-39.],
[ 6.]])
The result is a (4,1) array; reverse rows with:
In [9]: _7[::-1]
Out[9]:
array([[ 6.],
[-39.],
[ 55.],
[ 27.]])
Negative strides, [::-1] indexing is also used to reverse Python lists and strings.
In [10]: ['a','b','c'][::-1]
Out[10]: ['c', 'b', 'a']

How do you efficiently sum the occurences of a value in one array at positions in another array

Im looking for an efficient 'for loop' avoiding solution that solves an array related problem I'm having. I want to use a huge 1Darray (A -> size = 250.000) of values between 0 and 40 for indexing in one dimension, and a array (B) with the same size with values between 0 and 9995 for indexing in a second dimension.
The result should be an array with size (41, 9996) with for each index the amount of times that any value from array 1 occurs at a value from array 2.
Example:
A = [0, 3, 2, 4, 3]
B = [1, 2, 2, 0, 2]
which should result in:
[[0, 1, 0,
[0, 0, 0,
[0, 0, 1,
[0, 0, 2,
[1, 0, 0]]
The dirty way is too slow as the amount of data is huge, what you would be able to do is:
out = np.zeros(41,9995)
for i in A:
for j in B:
out[i,j] += 1
which will take 238.000 * 238.000 loops...
I've tried this, which works partially:
out = np.zeros(41,9995)
out[A,B] += 1
Which generates a result with 1 everywhere, regardless of the amount of times the values occur.
Does anyone have a clue how to fix this? Thanks in advance!

You are looking for a sparse tensor:
import torch
A = [0, 3, 2, 4, 3]
B = [1, 2, 2, 0, 2]
idx = torch.LongTensor([A, B])
torch.sparse.FloatTensor(idx, torch.ones(idx.shape[1]), torch.Size([5,3])).to_dense()
Output:
tensor([[0., 1., 0.],
[0., 0., 0.],
[0., 0., 1.],
[0., 0., 2.],
[1., 0., 0.]])
You can also do the same with scipy sparse matrix:
import numpy as np
from scipy.sparse import coo_matrix
coo_matrix((np.ones(len(A)), (np.array(A), np.array(B))), shape=(5,3)).toarray()
output:
array([[0., 1., 0.],
[0., 0., 0.],
[0., 0., 1.],
[0., 0., 2.],
[1., 0., 0.]])
Sometimes it is better to leave the matrix in its sparse representation, rather than forcing it to be "dense" again.

Use numpy.add.at:
import numpy as np
A = [0, 3, 2, 4, 3]
B = [1, 2, 2, 0, 2]
arr = np.zeros((5, 3))
np.add.at(arr, (A, B), 1)
print(arr)
Output
[[0. 1. 0.]
[0. 0. 0.]
[0. 0. 1.]
[0. 0. 2.]
[1. 0. 0.]]

Given that the numbers are in a small range, bincount would be a good choice for bin-based summing -
def accumulate_coords(A,B):
nrows = A.max()+1
ncols = B.max()+1
return np.bincount(A*ncols+B,minlength=nrows*ncols).reshape(-1,ncols)
Sample run -
In [55]: A
Out[55]: array([0, 3, 2, 4, 3])
In [56]: B
Out[56]: array([1, 2, 2, 0, 2])
In [58]: accumulate_coords(A,B)
Out[58]:
array([[0, 1, 0],
[0, 0, 0],
[0, 0, 1],
[0, 0, 2],
[1, 0, 0]])

Numpy calculation of eigenvectors is incorrect

I run the following in Python and expected the columns in E[1] to be the eigenvectors of A, but they are not. Only Sympy.Matrix.eigenvects() seem to do it right. Why this error?
A
Out[194]:
matrix([[-3, 3, 2],
[ 1, -1, -2],
[-1, -3, 0]])
E = np.linalg.eig(A)
E
Out[196]:
(array([ 2., -4., -2.]),
matrix([[ -2.01889132e-16, 9.48683298e-01, 8.94427191e-01],
[ 5.54700196e-01, -3.16227766e-01, -3.71551690e-16],
[ -8.32050294e-01, 2.73252305e-17, 4.47213595e-01]]))
A*E[1] / E[1]
Out[205]:
matrix([[ 6.59900617, -4. , -2. ],
[ 2. , -4. , -3.88449298],
[ 2. , 8.125992 , -2. ]])

The eigenvectors are correct, within an expected margin of error.
What you discovered is that testing eigenvectors with element-wise division is a bad idea.
A better way is to compute the norm of the difference between matrix*vector and eigenvalue*vector.
NumPy performs computations in floating point arithmetics, limited to 52 bits of precision (double precision). This means any of its answers may contain numerical errors, at least of relative size 2**(-52) which is about 2e-16. So, when you see a number like 2e-16 coming from a calculation with numbers of size 1-3, the conclusion is: "that number should probably be zero, and the value we have for it is likely just noise". And if you divide by that number, noise is all you get.
SymPy, on the other hand, performs symbolic manipulations, so its answer (when it can get one) is exactly what the theory predicts.

From its docs:
The number w is an eigenvalue of a if there exists a vector v such that dot(a,v) = w * v. Thus, the arrays a, w, and v satisfy the equations dot(a[:,:], v[:,i]) = w[i] * v[:,i] for i \in {0,...,M-1}.
With your matrix:
In [1]: A = np.array([[-3, 3, 2],
...: [ 1, -1, -2],
...: [-1, -3, 0]])
...:
In [2]: w,v=np.linalg.eig(A)
In [3]: w
Out[3]: array([ 2., -4., -2.])
In [4]: v
Out[4]:
array([[ -9.39932874e-17, 9.48683298e-01, 8.94427191e-01],
[ 5.54700196e-01, -3.16227766e-01, 1.93473310e-16],
[ -8.32050294e-01, -4.08811066e-17, 4.47213595e-01]])
In [5]: np.dot(A,v)
Out[5]:
array([[ -2.22044605e-16, -3.79473319e+00, -1.78885438e+00],
[ 1.10940039e+00, 1.26491106e+00, -7.77156117e-16],
[ -1.66410059e+00, 4.44089210e-16, -8.94427191e-01]])
In [6]: w*v
Out[6]:
array([[ -1.87986575e-16, -3.79473319e+00, -1.78885438e+00],
[ 1.10940039e+00, 1.26491106e+00, -3.86946619e-16],
[ -1.66410059e+00, 1.63524427e-16, -8.94427191e-01]])
In [7]: np.dot(A,v)-w*v
Out[7]:
array([[ -3.40580301e-17, 8.88178420e-16, 2.22044605e-16],
[ 8.88178420e-16, -6.66133815e-16, -3.90209498e-16],
[ -2.22044605e-16, 2.80564783e-16, -3.33066907e-16]])
In [8]: np.allclose(np.dot(A,v), w*v)
Out[8]: True
So, yes, the documented test is satisfied, within floating point limits.
einsum can be used to highlight the i axis in the dot calculation.
In [10]: np.einsum('...k,ki->...i',A,v)
Out[10]:
array([[ -2.22044605e-16, -3.79473319e+00, -1.78885438e+00],
[ 1.10940039e+00, 1.26491106e+00, -7.77156117e-16],
[ -1.66410059e+00, 3.88578059e-16, -8.94427191e-01]])
When I divide by v (element wise), the result matches the eigenvalues, 2 -4,-2, except where v and the dot are virtually 0 (1e-16 or smaller).
In [11]: np.einsum('...k,ki->...i',A,v)/v
Out[11]:
array([[ 2.36234534, -4. , -2. ],
[ 2. , -4. , -4.01686475],
[ 2. , -9.50507681, -2. ]])

Triangular indexing and choice of summation axis for multidimensional arrays / matrices

I'm trying to solve a problem using multidimensional arrays, rather than resorting to for loops, in order to gain a performance boost, but am having trouble with the indexing.
I've tried various permutations using np.newaxis, but can't seem to achieve the following functionality.
Problem:
Part 1) Take an M x N x N array called a, and for each of the M square matrices, set the upper triangular matrix elements as their negative values.
Part 2) Sum all elements in each of the M matrices (of shape N X N), returning a 1D array with M elements. Let's call this array b.
Attempted Solution
Here is my MWP / attempt using loops (which does work, but I'd rather find a fully array/matrix-based approach
a = np.array(
[[[ 0, 1],
[ 5, 0]],
[[ 0, 3],
[ 2, 0]]])
Part 1):
triangular_upper_idx = np.triu_indices_from(a[0])
for i in range(len(a)):
a[i][triangular_upper_idx] *= -1
a
result:
array([[[ 0, -1],
[ 5, 0]],
[[ 0, -3],
[ 2, 0]]])
Part 2):
b = np.zeros(len(a))
for i in range(len(a)):
b[i] = np.sum(a[i])
b
result:
array([ 4., -1.])
Note:
I have seen a similar question on this topic (Triangular indices for multidimensional arrays in numpy) but the solution there was nested for loops... I feel like numpy may offer a more efficient, clever array-based solution?
Any guidance would be much appreciated.
Thanks

yes numpy has the tools
r = 2
neg_uppr = np.triu(-np.ones((r,r)),1) + np.tril(np.ones((r,r)))
can't tell from your numerical example if you want the diagonal to be inverted too? Then use np.triu(-np.ones((r,r))) + np.tril(np.ones((r,r)),-1)
neg_uppr
Out[23]:
array([[ 1., -1.],
[ 1., 1.]])
a = np.array(
[[[ 0, 1],
[ 5, 0]],
[[ 0, 3],
[ 2, 0]]])
its fast to use the builtin element-wise arithmetic
a = a * neg_uppr
a
Out[26]:
array([[[ 0., -1.],
[ 5., 0.]],
[[ 0., -3.],
[ 2., 0.]]])
you can specify axes to sum over:
np.sum(a, (1,2))
Out[27]: array([ 4., -1.])

How to split an array based on minimum row value using vectorization

I am trying to figure out how to take the following for loop that splits an array based on the index of the lowest value in the row and use vectorization. I've looked at this link and have been trying to use the numpy.where function but currently unsuccessful.
For example if an array has n columns, then all the rows where col[0] has the lowest value are put in one array, all the rows where col[1] are put in another, etc.
Here's the code using a for loop.
import numpy
a = numpy.array([[ 0. 1. 3.]
[ 0. 1. 3.]
[ 0. 1. 3.]
[ 1. 0. 2.]
[ 1. 0. 2.]
[ 1. 0. 2.]
[ 3. 1. 0.]
[ 3. 1. 0.]
[ 3. 1. 0.]])
result_0 = []
result_1 = []
result_2 = []
for value in a:
if value[0] <= value[1] and value[0] <= value[2]:
result_0.append(value)
elif value[1] <= value[0] and value[1] <= value[2]:
result_1.append(value)
else:
result_2.append(value)
print(result_0)
>>[array([ 0. 1. 3.]), array([ 0. 1. 3.]), array([ 0. 1. 3.])]
print(result_1)
>>[array([ 1. 0. 2.]), array([ 1. 0. 2.]), array([ 1. 0. 2.])]
print(result_2)
>>[array([ 3. 1. 0.]), array([ 3. 1. 0.]), array([ 3. 1. 0.])]

First, use argsort to see where the lowest value in each row is:
>>> a.argsort(axis=1)
array([[0, 1, 2],
[0, 1, 2],
[0, 1, 2],
[1, 0, 2],
[1, 0, 2],
[1, 0, 2],
[2, 1, 0],
[2, 1, 0],
[2, 1, 0]])
Note that wherever a row has 0, that is the smallest column in that row.
Now you can build the results:
>>> sortidx = a.argsort(axis=1)
>>> [a[sortidx[:,i] == 0] for i in range(a.shape[1])]
[array([[ 0., 1., 3.],
[ 0., 1., 3.],
[ 0., 1., 3.]]),
array([[ 1., 0., 2.],
[ 1., 0., 2.],
[ 1., 0., 2.]]),
array([[ 3., 1., 0.],
[ 3., 1., 0.],
[ 3., 1., 0.]])]
So it is done with only a single loop over the columns, which will give a huge speedup if the number of rows is much larger than the number of columns.

This is not the best solution since it relies on simple python loops and is not very efficient when you start dealing with large data sets but it should get you started.
The point is to create an array of "buckets" which store the data based on the depth of the lengthiest element. Then enumerate each element in values, selecting the smallest one and saving its offset which is subsequently appended to the correct results "bucket", for each a. Finally we print this out in the last loop.
Solution using loops:
import numpy
import pprint
# random data set
a = numpy.array([[0, 1, 3],
[0, 1, 3],
[0, 1, 3],
[1, 0, 2],
[1, 0, 2],
[1, 0, 2],
[3, 1, 0],
[3, 1, 0],
[3, 1, 0]])
# create a list of results as big as the depth of elements in an entry
results = list()
for l in range(max(len(i) for i in a)):
results.append(list())
# don't do the following because all the references to the lists will be the same and you get dups:
# results = [[]]*max(len(i) for i in a)
for value in a:
res_offset, _val = min(enumerate(value), key=lambda x: x[1]) # get the offset and min value
results[res_offset].append(value) # store the original Array obj in the correct "bucket"
# print for visualization
for c, r in enumerate(results):
print("result_%s: %s" % (c, r))
Outputs:
result_0: [array([0, 1, 3]), array([0, 1, 3]), array([0, 1, 3])]
result_1: [array([1, 0, 2]), array([1, 0, 2]), array([1, 0, 2])]
result_2: [array([3, 1, 0]), array([3, 1, 0]), array([3, 1, 0])]

I found a much easier way to do this. I hope that I am interpreting the OP correctly.
My sense is that the OP wants to create a slice of the larger array based upon some set of conditions.
Note that the code above to create the array does not seem to work--at least in python 3.5. I generated the array as follow.
a = np.array([0., 1., 3., 0., 1., 3., 0., 1., 3., 1., 0., 2., 1., 0., 2.,1., 0., 2.,3., 1., 0.,3., 1., 0.,3., 1., 0.]).reshape([9,3])
Next, I sliced the original array into smaller arrays. Numpy has builtins to help with this.
result_0 = a[np.logical_and(a[:,0] <= a[:,1],a[:,0] <= a[:,2])]
result_1 = a[np.logical_and(a[:,1] <= a[:,0],a[:,1] <= a[:,2])]
result_2 = a[np.logical_and(a[:,2] <= a[:,0],a[:,2] <= a[:,1])]
This will generate new numpy arrays that match the given conditions.
Note if the user wants to convert these individual rows into a list or arrays, he/she can just enter the following code to obtain the result.
result_0 = [np.array(x) for x in result_0.tolist()]
result_0 = [np.array(x) for x in result_1.tolist()]
result_0 = [np.array(x) for x in result_2.tolist()]
This should generate the outcome requested in the OP.

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.

Difference between array and matrix numpy for solving linear equations - python

Related

Reversing output matrix values on Numpy! Is there a specific command for this?

How do you efficiently sum the occurences of a value in one array at positions in another array

Numpy calculation of eigenvectors is incorrect

Triangular indexing and choice of summation axis for multidimensional arrays / matrices

How to split an array based on minimum row value using vectorization

Categories

Resources