How do I calculate the matrix exponential of a sparse matrix? - python

I'm trying to find the matrix exponential of a sparse matrix:
import numpy as np
b = np.array([[1, 0, 1, 0, 1, 0, 1, 1, 1, 0],
[1, 0, 0, 0, 1, 1, 0, 1, 1, 0],
[0, 1, 1, 0, 1, 1, 0, 0, 1, 1],
[0, 0, 0, 0, 0, 1, 1, 1, 0, 0],
[1, 1, 0, 0, 0, 0, 1, 0, 0, 0],
[0, 0, 1, 0, 0, 1, 0, 0, 1, 1],
[0, 0, 1, 0, 1, 0, 1, 1, 0, 0],
[1, 0, 0, 0, 1, 1, 0, 0, 1, 1],
[0, 0, 0, 0, 1, 0, 1, 1, 1, 0],
[0, 0, 0, 1, 0, 1, 1, 0, 0, 1]])
I can calculate this using scipy.linalg.expm, but it is slow for larger matrices.
from scipy.linalg import expm
S1 = expm(b)
Since this is a sparse matrix, I tried converting b to a scipy.sparse matrix and calling that function on the converted sparse matrix:
import scipy.sparse as sp
import numpy as np
sp_b = sp.csr_matrix(b)
S1 = expm(sp_b);
But I get the following error:
loop of ufunc does not support argument 0 of type csr_matrix which has no callable exp method
How can I calculate the matrix exponential of a sparse matrix?

You need to use scipy.sparse.linalg.expm for your sparse matrix instead of scipy.linalg.expm.
import scipy.sparse as sp
from scipy.sparse.linalg import expm
import numpy as np
b = np.array([[1, 0, 1, 0, 1, 0, 1, 1, 1, 0],
[1, 0, 0, 0, 1, 1, 0, 1, 1, 0],
[0, 1, 1, 0, 1, 1, 0, 0, 1, 1],
[0, 0, 0, 0, 0, 1, 1, 1, 0, 0],
[1, 1, 0, 0, 0, 0, 1, 0, 0, 0],
[0, 0, 1, 0, 0, 1, 0, 0, 1, 1],
[0, 0, 1, 0, 1, 0, 1, 1, 0, 0],
[1, 0, 0, 0, 1, 1, 0, 0, 1, 1],
[0, 0, 0, 0, 1, 0, 1, 1, 1, 0],
[0, 0, 0, 1, 0, 1, 1, 0, 0, 1]])
sp_b = sp.csr_matrix(b)
S1 = expm(sp_b)
Note: Passing a CSR matrix triggers the warning "SparseEfficiencyWarning: spsolve is more efficient when sparse b is in the CSC matrix format". To get rid of it, you can do as the warning suggests and define a CSC matrix instead, if that makes sense for your application:
sp_b = sp.csc_matrix(b)
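As a quick sanity check, here is a minimal sketch that builds a sparse matrix directly in CSC format and compares the sparse result against the dense one. The tridiagonal test matrix, its size, and the variable names are illustrative assumptions, not part of the question. It also shows scipy.sparse.linalg.expm_multiply, which computes the action of the exponential on a vector without forming the full exponential; whether that helps depends on what you need the exponential for.

```python
import numpy as np
import scipy.sparse as sp
from scipy.linalg import expm as dense_expm
from scipy.sparse.linalg import expm as sparse_expm, expm_multiply

# Illustrative test matrix (an assumption): 50x50 tridiagonal, stored as CSC.
n = 50
A = sp.diags([1.0, -2.0, 1.0], offsets=[-1, 0, 1], shape=(n, n),
             format="csc", dtype=float)

# Sparse and dense routines should agree up to rounding error.
E_sparse = sparse_expm(A)
E_dense = dense_expm(A.toarray())
print(np.allclose(E_sparse.toarray(), E_dense))

# If only exp(A) @ v is needed, expm_multiply skips building exp(A) itself.
v = np.ones(n)
w = expm_multiply(A, v)
print(np.allclose(w, E_dense @ v))
```

Keep in mind that the exponential of a sparse matrix is usually dense, so the result may need far more memory than the input.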

Related

Labeling via majority vote of connected clusters in python3

I have a tensor with three dimensions and three classes (0: background, 1: first class, 2: second class). I would like to find connected clusters and assign outlier's labels by performing a majority vote. A 2D example:
import numpy as np
data = np.array([[0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0],
[0, 1, 0, 0, 0, 0, 0],
[1, 1, 1, 0, 0, 1, 2],
[1, 2, 0, 0, 2, 2, 2],
[0, 1, 0, 0, 0, 2, 0],
[0, 0, 0, 0, 0, 0, 0],])
should be changed to
data = np.array([[0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0],
[0, 1, 0, 0, 0, 0, 0],
[1, 1, 1, 0, 0, 2, 2],
[1, 1, 0, 0, 2, 2, 2],
[0, 1, 0, 0, 0, 2, 0],
[0, 0, 0, 0, 0, 0, 0],])
It is enough to treat each connected region as one cluster and count the appearances of the labels. I am not looking for any machine learning method.
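You can use scipy.ndimage.label to find the connected components and then np.bincount for the counting (older SciPy versions expose the same function as scipy.ndimage.measurements.label):
from scipy.ndimage import label
lbl, ncl = label(data)  # lbl: cluster index per cell, ncl: number of clusters
# Count each (cluster, class) pair, then pick the majority class per cluster
lut = np.bincount((data+2*lbl).ravel(),None,2*ncl+3)[1:].reshape(-1,2).argmax(1)+1
lut[0] = 0  # background stays 0
lut[lbl]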
# array([[0, 0, 0, 0, 0, 0, 0],
# [0, 0, 0, 0, 0, 0, 0],
# [0, 1, 0, 0, 0, 0, 0],
# [1, 1, 1, 0, 0, 2, 2],
# [1, 1, 0, 0, 2, 2, 2],
# [0, 1, 0, 0, 0, 2, 0],
# [0, 0, 0, 0, 0, 0, 0]])
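For readers who find the lookup-table one-liner hard to follow, the same idea can be sketched as an explicit loop over clusters. This is a slower but more readable variant, not the answer's method; the function name majority_relabel is hypothetical. It assumes non-negative integer class labels with 0 as background, and it relies on scipy.ndimage.label, which also accepts 3D input.

```python
import numpy as np
from scipy import ndimage

def majority_relabel(data):
    """Give every connected cluster the class that occurs most often in it."""
    lbl, ncl = ndimage.label(data)        # nonzero cells are grouped into clusters
    out = np.zeros_like(data)
    for i in range(1, ncl + 1):
        mask = lbl == i
        counts = np.bincount(data[mask])  # counts[c] = cells of class c in cluster i
        out[mask] = counts.argmax()       # ties resolve to the smaller class id
    return out

data = np.array([[0, 0, 0, 0, 0, 0, 0],
                 [0, 0, 0, 0, 0, 0, 0],
                 [0, 1, 0, 0, 0, 0, 0],
                 [1, 1, 1, 0, 0, 1, 2],
                 [1, 2, 0, 0, 2, 2, 2],
                 [0, 1, 0, 0, 0, 2, 0],
                 [0, 0, 0, 0, 0, 0, 0]])
result = majority_relabel(data)
print(result)
```

By default, label treats only directly adjacent cells (no diagonals) as connected; pass a structure argument if diagonal neighbours should count too.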

SymPy rref() returning an identity matrix for a singular matrix

import numpy
import sympy
n = 7
k = 3
X = numpy.random.randn(n,k)
Px = X @ numpy.linalg.inv(numpy.transpose(X) @ X) @ numpy.transpose(X)  # X(X'X)^(-1)X'
print(sympy.Matrix(Px).rref())
As you may verify yourself, Px is singular. However, sympy.rref() returns this:
(Matrix([[1, 0, 0, 0, 0, 0, 0],
[0, 1, 0, 0, 0, 0, 0],
[0, 0, 1, 0, 0, 0, 0],
[0, 0, 0, 1, 0, 0, 0],
[0, 0, 0, 0, 1, 0, 0],
[0, 0, 0, 0, 0, 1, 0],
[0, 0, 0, 0, 0, 0, 1]]), (0, 1, 2, 3, 4, 5, 6))
Why doesn't it return the real rref? I read somewhere I could pass simplify=True, however it didn't make any difference.
In [49]: Px
Out[49]:
array([[ 0.5418898 , 0.44245552, 0.04973693, -0.06834885, -0.19086119,
-0.07003176, 0.06325021],...
[ 0.06325021, -0.11080081, 0.21656224, -0.07445145, -0.28634725,
0.06648907, 0.19199866]])
In [50]: np.linalg.det(Px)
Out[50]: 2.141647537907433e-67
In [51]: np.linalg.inv(Px)
Out[51]:
array([[-7.18788695e+15, 4.95655702e+15, 7.52738018e+15,
-4.40875311e+15, -1.64015565e+16, 2.63785320e+15,
-3.03465003e+16],
[ 1.59176426e+16, ....
[ 3.31636798e+16, -3.39094560e+16, -3.60287970e+16,
-1.27160460e+16, 2.14338015e+16, 3.32345350e+15,
3.60287970e+16]])
Your Px is close to singular, but not exactly so. Contrast that with
In [52]: M = np.arange(9).reshape(3,3)
In [53]: np.linalg.det(M)
Out[53]: 0.0
In [55]: np.linalg.inv(M)
LinAlgError: Singular matrix
In [56]: sympy.Matrix(M).rref()
Out[56]:
(Matrix([
[1, 0, -1],
[0, 1, 2],
[0, 0, 0]]), (0, 1))
Numerically speaking your Px is not singular, just close:
In [57]: sympy.Matrix(Px).rref()
Out[57]:
(Matrix([
[1, 0, 0, 0, 0, 0, 0],
[0, 1, 0, 0, 0, 0, 0],
[0, 0, 1, 0, 0, 0, 0],
[0, 0, 0, 1, 0, 0, 0],
[0, 0, 0, 0, 1, 0, 0],
[0, 0, 0, 0, 0, 1, 0],
[0, 0, 0, 0, 0, 0, 1]]), (0, 1, 2, 3, 4, 5, 6))
But with a custom iszerofunc:
In [58]: sympy.Matrix(Px).rref(iszerofunc=lambda x: abs(x)<1e-16)
Out[58]:
(Matrix([
[1, 0, 0, 0.647383887198708, -1.91409951634531, -1.43377991000974, 0.578981680134581],
[0, 1, 0, -0.839184067893959, 1.88998490600173, 1.43367640627271, -0.611620902311026],
[0, 0, 1, -0.962221703397948, 0.203783478612254, 1.45929622452135, 0.404548167005728],
[0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0]]),
(0, 1, 2))
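If the goal is only to confirm the rank of a floating-point matrix, a tolerance-based numerical rank is one alternative to rref. The sketch below uses numpy.linalg.matrix_rank, which counts singular values above a threshold. The fixed seed and the tolerance of 1e-10 are assumptions chosen for illustration; a suitable tolerance depends on the scale and conditioning of your data.

```python
import numpy as np

rng = np.random.default_rng(0)   # fixed seed, an assumption for reproducibility
n, k = 7, 3
X = rng.standard_normal((n, k))
Px = X @ np.linalg.inv(X.T @ X) @ X.T   # projection onto the column space of X

# Singular values below tol are treated as zero (tol is an illustrative choice).
rank = np.linalg.matrix_rank(Px, tol=1e-10)
print(rank)
```

A projection onto the column space of a full-rank n-by-k matrix has rank k, so the reported rank should equal k here, which matches the three pivot columns found by rref with the custom iszerofunc.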

What is an efficient way of counting groups of ones on 2D grid in python? [Figures below]

I have a few hundred 2D numpy arrays. They contain zeros and ones. A few examples with plots, yellow indicates ones, purple indicates zeros:
grid1=np.array([[1, 1, 0, 0, 1, 1, 0, 0],
[1, 1, 0, 1, 1, 1, 0, 0],
[1, 1, 0, 1, 1, 1, 0, 0],
[1, 0, 0, 0, 1, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0]])
plt.imshow(grid1)
grid2=np.array([[1, 1, 0, 0, 0, 0, 0, 0],
[1, 1, 1, 1, 1, 0, 0, 0],
[1, 1, 1, 1, 1, 0, 0, 0],
[1, 0, 0, 1, 1, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0]])
plt.imshow(grid2)
grid3=np.array([[1, 1, 0, 0, 1, 0, 0, 1],
[0, 1, 0, 1, 1, 0, 1, 1],
[0, 1, 0, 1, 1, 0, 0, 0],
[0, 0, 0, 0, 1, 1, 1, 0],
[1, 1, 1, 0, 0, 0, 0, 0]])
plt.imshow(grid3)
I'm looking for an efficient way to count the number of yellow blobs on the images. 2, 1 and 4 blobs on the images above, from top to bottom.
Is there an easy way to do this, or do I have to check whether each yellow cell is in the same blob as all the other yellow cells and write a script for that? (That looks very painful.)
scipy.ndimage.label (exposed as scipy.ndimage.measurements.label in older SciPy versions) does what you need.
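A minimal sketch of how that could look for the three grids from the question. The helper name count_blobs and its diagonal flag are hypothetical; label returns the labelled array together with the number of connected components, and only the count is used here.

```python
import numpy as np
from scipy import ndimage

def count_blobs(grid, diagonal=False):
    """Count connected groups of nonzero cells in a 2D grid."""
    # Default connectivity ignores diagonal neighbours; a full 3x3
    # structuring element makes diagonal neighbours count as connected.
    structure = np.ones((3, 3), dtype=int) if diagonal else None
    _, n_blobs = ndimage.label(grid, structure=structure)
    return n_blobs

grid1 = np.array([[1, 1, 0, 0, 1, 1, 0, 0],
                  [1, 1, 0, 1, 1, 1, 0, 0],
                  [1, 1, 0, 1, 1, 1, 0, 0],
                  [1, 0, 0, 0, 1, 0, 0, 0],
                  [0, 0, 0, 0, 0, 0, 0, 0]])
grid2 = np.array([[1, 1, 0, 0, 0, 0, 0, 0],
                  [1, 1, 1, 1, 1, 0, 0, 0],
                  [1, 1, 1, 1, 1, 0, 0, 0],
                  [1, 0, 0, 1, 1, 0, 0, 0],
                  [0, 0, 0, 0, 0, 0, 0, 0]])
grid3 = np.array([[1, 1, 0, 0, 1, 0, 0, 1],
                  [0, 1, 0, 1, 1, 0, 1, 1],
                  [0, 1, 0, 1, 1, 0, 0, 0],
                  [0, 0, 0, 0, 1, 1, 1, 0],
                  [1, 1, 1, 0, 0, 0, 0, 0]])

blob_counts = [count_blobs(g) for g in (grid1, grid2, grid3)]
print(blob_counts)  # [2, 1, 4]
```

With a few hundred small arrays, calling this in a plain loop or list comprehension should be fast enough in most cases.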

alternating values in numpy

I'm trying to make my code more efficient and readable, and I'm stuck. Assume I want to build something like a chess board, with alternating black and white colors on an 8x8 grid. So, using numpy, I have done this:
import numpy as np
board = np.zeros((8,8), np.int32)
for ri in range(8):
    for ci in range(8):
        if (ci + ri) % 2 == 0:
            board[ri,ci] = 1
Which nicely outputs:
array([[1, 0, 1, 0, 1, 0, 1, 0],
[0, 1, 0, 1, 0, 1, 0, 1],
[1, 0, 1, 0, 1, 0, 1, 0],
[0, 1, 0, 1, 0, 1, 0, 1],
[1, 0, 1, 0, 1, 0, 1, 0],
[0, 1, 0, 1, 0, 1, 0, 1],
[1, 0, 1, 0, 1, 0, 1, 0],
[0, 1, 0, 1, 0, 1, 0, 1]], dtype=int32)
That I can then parse as white squares or black squares. However, in practice my array is much larger, and this way is very inefficient and unreadable. I assumed numpy already has this figured out, so I tried this:
board = np.zeros(64, np.int32)
board[::2] = 1
board = board.reshape(8,8)
But that output is wrong, and looks like this:
array([[1, 0, 1, 0, 1, 0, 1, 0],
[1, 0, 1, 0, 1, 0, 1, 0],
[1, 0, 1, 0, 1, 0, 1, 0],
[1, 0, 1, 0, 1, 0, 1, 0],
[1, 0, 1, 0, 1, 0, 1, 0],
[1, 0, 1, 0, 1, 0, 1, 0],
[1, 0, 1, 0, 1, 0, 1, 0],
[1, 0, 1, 0, 1, 0, 1, 0]], dtype=int32)
Is there a better way to achieve what I want that works efficiently (and preferably, is readable)?
Note: I'm not attached to 1's and 0's. This can just as well be done with other types of values, even True/False or two kinds of strings, as long as it works.
Here's one approach using slicing with proper starts and stepsize of 2 in two steps -
board = np.zeros((8,8), np.int32)
board[::2,::2] = 1
board[1::2,1::2] = 1
Sample run -
In [229]: board = np.zeros((8,8), np.int32)
...: board[::2,::2] = 1
...: board[1::2,1::2] = 1
...:
In [230]: board
Out[230]:
array([[1, 0, 1, 0, 1, 0, 1, 0],
[0, 1, 0, 1, 0, 1, 0, 1],
[1, 0, 1, 0, 1, 0, 1, 0],
[0, 1, 0, 1, 0, 1, 0, 1],
[1, 0, 1, 0, 1, 0, 1, 0],
[0, 1, 0, 1, 0, 1, 0, 1],
[1, 0, 1, 0, 1, 0, 1, 0],
[0, 1, 0, 1, 0, 1, 0, 1]], dtype=int32)
Other tricky ways -
1) Broadcasted comparison :
In [254]: r = np.arange(8)%2
In [255]: (r[:,None] == r)*1
Out[255]:
array([[1, 0, 1, 0, 1, 0, 1, 0],
[0, 1, 0, 1, 0, 1, 0, 1],
[1, 0, 1, 0, 1, 0, 1, 0],
[0, 1, 0, 1, 0, 1, 0, 1],
[1, 0, 1, 0, 1, 0, 1, 0],
[0, 1, 0, 1, 0, 1, 0, 1],
[1, 0, 1, 0, 1, 0, 1, 0],
[0, 1, 0, 1, 0, 1, 0, 1]])
2) Broadcasted addition :
In [279]: r = np.arange(8)
In [280]: 1-(r[:,None] + r)%2
Out[280]:
array([[1, 0, 1, 0, 1, 0, 1, 0],
[0, 1, 0, 1, 0, 1, 0, 1],
[1, 0, 1, 0, 1, 0, 1, 0],
[0, 1, 0, 1, 0, 1, 0, 1],
[1, 0, 1, 0, 1, 0, 1, 0],
[0, 1, 0, 1, 0, 1, 0, 1],
[1, 0, 1, 0, 1, 0, 1, 0],
[0, 1, 0, 1, 0, 1, 0, 1]])
Just found out an alternative answer by myself, so posting it here for future reference to anyone who's interested:
a = np.array([[1,0],[0,1]])
b = np.tile(a, (4,4))
Results:
array([[1, 0, 1, 0, 1, 0, 1, 0],
[0, 1, 0, 1, 0, 1, 0, 1],
[1, 0, 1, 0, 1, 0, 1, 0],
[0, 1, 0, 1, 0, 1, 0, 1],
[1, 0, 1, 0, 1, 0, 1, 0],
[0, 1, 0, 1, 0, 1, 0, 1],
[1, 0, 1, 0, 1, 0, 1, 0],
[0, 1, 0, 1, 0, 1, 0, 1]])
I think the following is also a good way of doing it for a variable (even) input n:
import sys
import numpy as np
lines = sys.stdin.readlines()
n = int(lines[0])
a = np.array([[1,0], [0,1]], dtype=int)  # np.int was removed in NumPy 1.24; use int
outputData = np.tile(a, (n//2, n//2))
print(outputData)
For a single even input number n, you can achieve this with:
import numpy as np
i = np.eye(2)
i = i[::-1]
k = np.array(i, dtype=int)  # np.int was removed in NumPy 1.24; use int
print(np.tile(k,(n//2,n//2)))
I tried this and found it to be a shorter version for any given even number:
n = int(input())
import numpy as np
c = np.array([[0,1], [1, 0]])
print(np.tile(c, reps=(n//2, n//2)))
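The np.tile variants above build the board from 2x2 blocks, so with n//2 repetitions they only produce an n x n board when n is even. A minimal sketch that also handles odd sizes and non-square boards, based on the parity of row index plus column index (the same rule as the loop in the question), could look like this. The function name checkerboard and its parameters are hypothetical.

```python
import numpy as np

def checkerboard(n_rows, n_cols=None, dtype=int):
    """Return an n_rows x n_cols board with 1 wherever (row + col) is even."""
    if n_cols is None:
        n_cols = n_rows
    rows, cols = np.indices((n_rows, n_cols))
    return ((rows + cols) % 2 == 0).astype(dtype)

board8 = checkerboard(8)
board3 = checkerboard(3)
print(board3)
```

Passing dtype=bool gives a True/False board, which can be used directly as a mask for the white or black squares.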

How to get the length of repeated numbers column wise?

I am trying to get the length of repeated numbers in Python Numpy. For example, let's consider a simple ndarray
import numpy as np
a = np.array([
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[1, 0, 1, 0, 0, 1, 1, 1, 0, 1],
[0, 1, 0, 1, 0, 1, 0, 0, 1, 0],
[1, 1, 0, 0, 1, 1, 1, 1, 0, 0],
])
The first column is [0, 1, 0, 1]. Its first 1 is at index 1; counting from there, we get ones = 2 and zeros = 1. So for each column I have to start counting ones and zeros at the first 1 (the starting position).
So the answer for a would be
ones = [2, 2, 1, 1, 1, 3, 2, 2, 1, 1]
zeros = [1, 0, 2, 1, 0, 0, 1, 1, 1, 2]
Can anyone please help me out?
Update
3D array:
a = np.array([
[
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[1, 1, 0, 0, 0, 1, 1, 1, 0, 0],
[0, 1, 0, 0, 0, 1, 0, 0, 1, 0],
[1, 1, 0, 0, 1, 1, 1, 1, 0, 0],
],
[
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 1, 0, 0, 1, 0, 0, 0, 1, 1],
[0, 1, 0, 1, 0, 0, 0, 1, 0, 0],
[1, 1, 0, 1, 0, 1, 1, 1, 0, 0],
]
])
The expected output should be
ones = [
[2, 3, 0, 0, 1, 3, 2, 2, 1, 0],
[1, 3, 0, 2, 1, 1, 1, 2, 1, 1]
]
zeros = [
[1, 0, 0, 0, 0, 0, 1, 1, 1, 0],
[0, 0, 0, 0, 2, 0, 0, 0, 2, 2]
]
With focus on performance, here's one generic approach for ndarrays -
ones_count = a.sum(-2)
zeros_count = (a.shape[-2] - ones_count - a.argmax(-2))*a.any(-2)
One alternative to get zeros_count with selections using np.where, would be -
zeros_count = np.where(a.any(-2),a.shape[-2] - ones_count - a.argmax(-2),0)
Sample runs
2D case :
In [60]: a
Out[60]:
array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[1, 0, 1, 0, 0, 1, 1, 1, 0, 1],
[0, 1, 0, 1, 0, 1, 0, 0, 1, 0],
[1, 1, 0, 0, 1, 1, 1, 1, 0, 0]])
In [61]: ones_count = a.sum(-2)
...: zeros_count = (a.shape[-2] - ones_count - a.argmax(-2))*a.any(-2)
...:
In [62]: ones_count
Out[62]: array([2, 2, 1, 1, 1, 3, 2, 2, 1, 1])
In [63]: zeros_count
Out[63]: array([1, 0, 2, 1, 0, 0, 1, 1, 1, 2])
3D case :
In [65]: a = np.array([
...: [
...: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
...: [1, 1, 0, 0, 0, 1, 1, 1, 0, 0],
...: [0, 1, 0, 0, 0, 1, 0, 0, 1, 0],
...: [1, 1, 0, 0, 1, 1, 1, 1, 0, 0],
...: ],
...: [
...: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
...: [0, 1, 0, 0, 1, 0, 0, 0, 1, 1],
...: [0, 1, 0, 1, 0, 0, 0, 1, 0, 0],
...: [1, 1, 0, 1, 0, 1, 1, 1, 0, 0],
...: ]
...: ])
In [66]: ones_count = a.sum(-2)
...: zeros_count = (a.shape[-2] - ones_count - a.argmax(-2))*a.any(-2)
...:
In [67]: ones_count
Out[67]:
array([[2, 3, 0, 0, 1, 3, 2, 2, 1, 0],
[1, 3, 0, 2, 1, 1, 1, 2, 1, 1]])
In [68]: zeros_count
Out[68]:
array([[1, 0, 0, 0, 0, 0, 1, 1, 1, 0],
[0, 0, 0, 0, 2, 0, 0, 0, 2, 2]])
and so on for higher dim arrays.
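To see why the formula works: every 1 in a column lies at or after the first 1, so the sum already counts the ones, and the zeros after the first 1 are the column length minus the index of the first 1 minus the number of ones. The sketch below wraps the vectorized expressions in a function and cross-checks them against a plain per-column loop on the 2D example. The function names are hypothetical, and the input is assumed to contain only 0s and 1s.

```python
import numpy as np

def count_after_first_one(a):
    """Vectorized counts of ones and zeros from the first 1 onward, along axis -2."""
    a = np.asarray(a)
    n = a.shape[-2]
    ones = a.sum(-2)
    # argmax gives the index of the first 1; columns without any 1 get zeros = 0
    zeros = np.where(a.any(-2), n - ones - a.argmax(-2), 0)
    return ones, zeros

def count_naive(column):
    """Reference implementation for a single column, written for clarity."""
    column = list(column)
    if 1 not in column:
        return 0, 0
    tail = column[column.index(1):]   # everything from the first 1 onward
    return tail.count(1), tail.count(0)

a = np.array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
              [1, 0, 1, 0, 0, 1, 1, 1, 0, 1],
              [0, 1, 0, 1, 0, 1, 0, 0, 1, 0],
              [1, 1, 0, 0, 1, 1, 1, 1, 0, 0]])
ones, zeros = count_after_first_one(a)
naive = [count_naive(col) for col in a.T.tolist()]
print(ones.tolist(), zeros.tolist())
```

The loop version is only there as a check; on large arrays the vectorized version is the one to use.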
