Counting the number of components/objects in an array - python

I am wondering what is a good way to count the number of components in a matrix. Let's say that ones constitute a component and zeros constitute the background. So in the array below there are 4 components:
import numpy as np

a = np.array([
    [1,1,0,0,0,0,0,0],
    [1,1,1,0,0,0,1,0],
    [0,1,0,0,0,1,1,1],
    [0,0,0,0,0,1,1,1],
    [0,1,0,0,0,0,1,0],
    [0,0,0,1,1,0,0,0],
    [0,0,1,1,0,0,0,0]])
I know that I can do this with Scipy like this:
from scipy.ndimage import label
labeled_array, num_features = label(a)
where num_features will give me the correct answer of 4.
How would I implement this myself? I am asking what the correct technique would be. Preferably I want to implement this with matrix operations (e.g. a Numpy solution). So no for-loops where I am checking every value individually.
I am asking this because in the end I want to implement the solution in Tensorflow such that the whole thing is differentiable. This way I can add the number of components as a term in my loss function for an image segmentation problem, which is my end goal.
I thought of using morphological erosion to shrink the components in the matrix until only single 1s remain; then I could just take the sum of the array to count the number of components. Unfortunately, eroding an isolated one [[0,0,0],[0,1,0],[0,0,0]] removes it ([[0,0,0],[0,0,0],[0,0,0]]), and repeated erosion continues until no components are left. I also thought I could use skeletonization, which guarantees that a center point always remains? But I am not sure if that is the right technique, or how I would implement it.
I was wondering if anyone has any ideas on this problem or knows how to solve it. Any input would be greatly appreciated!

This can be done in linear time using the flood-fill algorithm. Here is an example:
count = 0
for i in range(a.shape[0]):
    for j in range(a.shape[1]):
        if a[i, j] == 1:
            # floodFill zeroes out the whole component containing (i, j),
            # so each component is counted exactly once
            floodFill(a, i, j)
            count += 1
While this solution uses loops, you can use Numba or Cython to mitigate the cost of the function calls and the loops. The resulting code should be very fast.
There is no flood-fill algorithm in Numpy, but there is one in several image packages such as OpenCV or scikit-image. You can also write your own implementation, as long as you use Cython or Numba to speed it up.
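For illustration, here is a minimal sketch of how the whole counting routine could look with Numba, using an iterative flood fill with 4-connectivity (the same connectivity scipy.ndimage.label uses by default); this is a sketch, not a drop-in replacement for the library routines:
import numpy as np
from numba import njit

@njit
def count_components(a):
    a = a.copy()                     # work on a copy so the caller's array is untouched
    h, w = a.shape
    count = 0
    for i in range(h):
        for j in range(w):
            if a[i, j] == 1:
                count += 1
                # flood-fill the component containing (i, j) with zeros
                stack = [(i, j)]
                while stack:
                    y, x = stack.pop()
                    if y >= 0 and y < h and x >= 0 and x < w and a[y, x] == 1:
                        a[y, x] = 0
                        stack.append((y + 1, x))
                        stack.append((y - 1, x))
                        stack.append((y, x + 1))
                        stack.append((y, x - 1))
    return count

# count_components(a) returns 4 for the example array in the question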
Actually, this operation is called segmentation and AFAIK it can be done directly in OpenCV in one pass. Once applied, you could easily count the labels with a NumPy operation (something like count = len(np.unique(segmentedResult)) - 1, subtracting one for the background label).
I highly doubt one can write efficient code using only matrix-based Numpy functions here. The morphological erosion method would take quadratic time, assuming it could be made to work correctly. A good scalar algorithm can sometimes outperform any inefficient vectorized algorithm.

Related

Kalman Filtering in Python

I've been trying to work on designing a Kalman Filter for a few weeks now, but I'm pretty sure I'm making a major error because my results are terrible. My common sense tells me it's because I'm using an already-existing matrix as my predicted state instead of using a transition matrix, but I'm not sure how to solve that if it indeed is the issue. By the way, this is my first time using Kalman Filtering, so I may be missing basic stuff.
Here is a detailed explanation:
I have 2 datasets of 81036 observations each, with each observation including 6 datapoints (i.e., I end up with 2 matrices of shape 81036 x 6). The first dataset is the measured state and the other one is the predicted state. I want to end up with a Python code that filters the data using both states, and I need the final covariance and error estimates. Here's the main part of my code:
import numpy as np

# nb of observations
nn = 81036
# nb of datapoints
ns = 6

# import
ps = np.genfromtxt('.......csv', delimiter=',')
ms = np.genfromtxt('.......csv', delimiter=',')

## kalman filtering with covariance
# initialize data (lazy initialization using means of columns)
xi = np.mean(ms, axis=0)

for i in np.arange(nn):
    # errors
    d = ms[i, :] - xi
    d2 = ps[i, :] - xi
    # covariance matrices
    P = np.zeros((ns, ns))
    R = np.zeros((ns, ns))
    for j in np.arange(ns):
        for s in np.arange(ns):
            P[j, s] = d[j] * d[s]
            R[j, s] = d2[j] * d2[s]
    # gain
    k = P * (P + R) ** -1
    # update estimate
    xi = xi + np.matmul(k, d2)
    # uncertainty/error
    I = np.identity(ns)
    mlt = np.matmul((I - k), P)
    mlt = np.matmul(mlt, (I - k).T)
    mlt2 = np.matmul(k, R)
    mlt2 = np.matmul(mlt2, k.T)
    Er = mlt + mlt2
When I run this code, my filtered state xi goes through the roof, so I'm pretty sure this is not the correct code. I've tried to fix it in several ways (e.g., I tried to calculate the covariance matrix in the standard way I'm used to, D'D/n; I tried to remove my predicted state matrix and simply add random noise to my measured state instead...), but nothing seems to work. I also tried some available libraries for Kalman filtering (as well as libraries in Matlab and R), but they either work in 1D only or need me to specify variables like the transition matrix, which I don't have. I'm at the end of my wits here, so I'd appreciate any help.
I've found the solution to this issue. Huge props to Kani for their comment, as it pointed me in the right direction.
It turns out that the issue is simply in the calculation of k. Although the equation is correct, the inverse function was not working properly because of the very small values in some instances of R and P. To solve this, I used the pseudoinverse instead, so the line for calculating k became as follows:
k = P @ np.linalg.pinv(P + R)
Note that this might not be as accurate as the inverse in other cases, but it does the trick here.
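For context, a minimal sketch of how the corrected gain and update step look inside the loop, using the P, R, d2 and xi already defined in the question and matrix products throughout:
k = P @ np.linalg.pinv(P + R)   # pseudoinverse copes with the near-singular P + R
xi = xi + k @ d2                # state update as a proper matrix-vector product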

What is the most efficient way to calculate the eigen values of a large covariance matrix?

I have been trying for some days to calculate the nearest positive semi-definite matrix from a very large covariance matrix to be able to sample from it.
I have tried MATLAB for this, but the memory usage is insane and it always crashes eventually, with no error message or log file as far as I searched. The function used for the calculation can be found here: https://www.mathworks.com/matlabcentral/fileexchange/42885-nearestspd. Optimizing the function to remove intermediate matrices seemed to reduce the memory usage, but it eventually crashes in much the same way.
I found this approach for doing the calculation https://stackoverflow.com/a/63131309/18660401 and switched to Python, in hopes of finding some GPU libraries to accelerate the calculations, but it seems I cannot find an up-to-date library that supports calculating the eigenvectors the way the numpy function does. This is the function I am using:
import numpy as np

def get_near_psd(A):
    C = (A + A.T) / 2
    eigval, eigvec = np.linalg.eig(C)
    eigval[eigval < 0] = 0
    return eigvec.dot(np.diag(eigval)).dot(eigvec.T)
I am currently trying to run the same function with numba, in hopes that the translation to LLVM is enough to make the calculation finish in reasonable time; I have only modified the version above to include the @jit decorator from numba.
There does not seem to be a very optimized way to do this as far as I can find on my own, so any suggestions for cracking this are greatly appreciated.
Edit: The matrix is a two-dimensional 60416x60416 covariance matrix, and it is to be used to generate new samples from the distribution given by the mean and covariance matrix calculated from a set of samples using a GAN. For training purposes, samples also need to be generated by randomly sampling the distribution, for which I intend to use numpy's multivariate_normal function.
A very up-to-date library that does have these capabilities, including GPU support, is PyTorch. Check out the examples for the torch.linalg.eig function and the corresponding accelerated function torch.linalg.eigh for symmetric/Hermitian matrices. You do have to convert the matrices from numpy to pytorch tensors first to do the computation (and then convert the result back), but you can definitely use it in a very similar way.
Of course, this library can't just magically give you more memory either, but it is highly optimized.
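As a rough illustration, a minimal PyTorch port of the function from the question might look like this (a sketch, assuming the symmetrized matrix fits in GPU memory; eigh exploits the symmetry that C has by construction):
import torch

def get_near_psd_torch(A):
    device = "cuda" if torch.cuda.is_available() else "cpu"
    C = torch.from_numpy((A + A.T) / 2).to(device)
    # eigh is specialized for symmetric/Hermitian matrices: faster than eig and returns real eigenvalues
    eigval, eigvec = torch.linalg.eigh(C)
    eigval = torch.clamp(eigval, min=0)        # clip negative eigenvalues to zero
    near_psd = (eigvec * eigval) @ eigvec.T    # V @ diag(w) @ V.T without materializing diag(w)
    return near_psd.cpu().numpy()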

Fastest way in Python to determine if a 4x4 matrix has complex eigenvalues

I am working with a 4x4 matrix which, in general, has complex-valued elements. I am trying to determine whether there exists a non-real eigenvalue for this matrix; I do not necessarily care what the eigenvalue is. My current algorithm for the numpy array A (which is pre-defined by me) is as follows:
import scipy.linalg as SciLA
import numpy as np
import mpmath as mp
w1 = SciLA.eigvals(A)
w2 = [mp.chop(i,tol=1e-14) for i in w1]
imag_list = [(np.imag(w2[i])) for i in range(0,len(w1))]
imag_num = np.sign(len([x for x in imag_list if x != 0]))
Using %timeit, the code takes around 1.43 ms per loop (after testing over 1000 loops) for a simple 4x4 matrix. However, I feel that there should be a simpler way of just checking if a certain matrix has complex eigenvalues. I also need the code to go faster, as I am looping over many 4x4 matrices. Any suggestions for possible packages or mathematical/numerical techniques to aid in simplifying the code and/or speeding it up would be greatly appreciated.
As per my comment above, I am going to assume that you are looking to see if any of the values is non-real, i.e. has a non-zero imaginary part. This isn't strictly a solution, but I'm guessing it'll be close enough for what you want:
The trace of a matrix is the sum of its eigenvalues. If this trace is non-real, certainly at least one of these eigenvalues must be non-real. So just check if this is the case, and if so you can be sure that there is a non-real eigenvalue.
This condition isn't perfect, of course; one can easily find matrices with a real trace but some non-real eigenvalues. Therefore, if the trace is real, you should fall back to the method above to figure out whether the eigenvalues are all real or not. However, for most applications, most matrices will have a non-real trace, and so your execution time should be much shorter since all you need to compute is the trace.
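A minimal sketch of this check with a fallback (illustrative only; the tolerance is an assumption you would tune to your data):
import numpy as np

def has_nonreal_eigenvalue(A, tol=1e-14):
    # cheap test first: a non-real trace guarantees at least one non-real eigenvalue
    if abs(np.trace(A).imag) > tol:
        return True
    # otherwise fall back to actually computing the eigenvalues
    return not np.allclose(np.linalg.eigvals(A).imag, 0, atol=tol)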
I guess what makes your code slow is that you are building three lists from the result and using loops to do it. Use numpy vectorized operations instead:
# This will tell you if all the eigenvalues are (nearly) real
np.allclose(SciLA.eigvals(A).imag, 0)

Fast subsequent multiplication of many matrices in python

I have to generate a matrix (a propagator, in physics terms) by the ordered multiplication of many other matrices. Each matrix is about size (30, 30), with all real entries (floats), but not symmetric. The number of matrices to multiply varies between 1e3 and 1e5. Each matrix is only slightly different from the previous one, but they do not commute (and at the end I need the product of all of these non-commutative multiplications). Each matrix corresponds to a certain time slice, so I know how to generate each of them independently, wherever they sit in the multiplication sequence. In the end I have to produce many such matrix propagators, so any performance enhancement is welcome.
What is the algorithm for fastest implementation of such matrix multiplication in python?
In particular -
How should I structure it? Are there fast axes, or preferable dimensions for the rows/columns of the matrices?
Assuming memory is not a problem, is it better to allocate and build all matrices before multiplying, or to generate each one per time step? To store each matrix in a dedicated variable before multiplication, or to generate it when needed and multiply directly?
What are the cumulative effects of function-call overhead when generating the matrices?
Since I know how to build each matrix, should this be parallelized? For example, maybe create batches from the start of the sequence and from the end, multiply them in parallel, and at the end multiply the results in the proper order?
Is it preferable to use a module other than numpy? Could Numba be useful, or some other efficient way to compile to C in place, or optimized external libraries? (Please give a reference if so; I don't have experience with that.)
Thanks in advance.
I don't think that the matrix multiplication would take much time. So, I would do it in a single loop. The assembling is probably the costly part here.
If you have bigger matrices, a map-reduce approach could be helpful (split the sequence of matrices into contiguous chunks, multiply out each chunk, and then multiply the chunk results in the original order).
Numpy is perfectly fine for problems like this as it is pretty optimized. (and is partly in C)
Just test how much time the matrix multiplication takes and how much the assembling. The result should indicate where you need to optimize.
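As a minimal sketch of the single-loop approach, generating each matrix on the fly so memory stays flat (make_matrix is a hypothetical stand-in for however each time-slice matrix is built):
import numpy as np

def propagator(n_steps, make_matrix, size=30):
    # ordered, non-commutative product M(n-1) @ ... @ M(1) @ M(0)
    P = np.identity(size)
    for t in range(n_steps):
        P = make_matrix(t) @ P   # left-multiply to preserve the time ordering
    return P

Timing this loop against the cost of building the matrices themselves should show which side dominates.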

Efficient way to solve matrix equation in Python

Right now I am using numpy.linalg.solve to solve my matrix equation, but the fact that I am using it on a 5000*17956 matrix makes it really time-consuming. It runs really slowly and has already taken me more than an hour. The running time for solving a matrix equation is probably O(n^3), but I never thought it would be this slow. Is there any way to solve it faster in Python?
My code is something like this, solving for a in the equation B.T * U.T = (B.T * B) a, where m is the number of test cases (in my case over 5000), B is an m*17956 data matrix, and U is 1*m.
C = 0.005 # hyperparameter term for regularization
I = np.identity(17956) # 17956*17956 identity matrix
rhs = np.dot(B.T, U.T) # (17956*m) * (m*1) = 17956*1
lhs = np.dot(B.T, B)+C*I # (17956*m) * (m*17956) = 17956*17956
a = np.linalg.solve(lhs, rhs) # B.T u = B.T B a, solve for a (17956*1)
Update (2 July 2018): The updated question asks about the impact of a regularization term and the type of data in the matrices. In general, this can make a large impact in terms of the datatypes a particular CPU is most optimized for (as a rough rule of thumb, AMD is better with vectorized integer math and Intel is better with vectorized floating point math when all other things are held equal), and the presence of a large number of zero values can allow for the use of sparse matrix libraries. In this particular case though, the changes on the main diagonal (well under 1% of all the values in consideration) will have a negligible impact in terms of runtime.
TLDR;
An hour is reasonable (a cubic regression suggests that this would take around 83 minutes on my machine -- a low-end chromebook).
The pre-processing to generate lhs and rhs accounts for almost none of that time.
You won't be able to solve that exact problem much faster than with numpy.linalg.solve.
If m is small as you suggest and if B is invertible, you can instead solve the equation U.T=Ba in a minute or less.
If this is part of a larger problem, this costly intermediate step might be able to be simplified away from a mathematical framework.
Performance bottlenecks really should be addressed with profiling to figure out which step is causing the issues.
Since this comes from real-world data, you might be able to get away with fewer features (either directly or through a reduction step like PCA, NMF, or LLE), depending on the end goal.
As mentioned in another answer, if the matrix is sufficiently sparse you can get away with sparse linear algebra routines to great effect (many natural language processing data sources are like this).
Since the output is a 1D vector, I would use np.dot(U, B).T instead of np.dot(B.T, U.T). Transposes are neat that way. This avoids doing the transpose on a big matrix like B, though since you have a cubic operation as the dominant step this doesn't matter much for your problem.
Depending on whether you need the original data anymore and if the matrices involved have any other special properties, you might be able to fiddle with the parameters in scipy.linalg.solve instead for a gain.
I've had mixed success replacing large matrix equations with block matrix equations falling back on numpy routines. That approach typically saves 5-20% over numpy approaches and takes 1% or so off scipy approaches on my system. I haven't fully explored the reason for the discrepancy.
Assuming your matrix is sparse, the scipy.sparse.linalg module will be useful; see the documentation for that module and for spsolve in particular.
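A minimal sketch of that route, assuming B and U are the matrices from the question and that B really is mostly zeros (illustrative only):
from scipy import sparse
from scipy.sparse.linalg import spsolve

C = 0.005
Bs = sparse.csr_matrix(B)                                       # B: the m x 17956 data matrix
lhs = (Bs.T @ Bs + C * sparse.identity(Bs.shape[1])).tocsc()    # spsolve works best with CSC
rhs = Bs.T @ U.ravel()                                          # U is 1 x m; ravel() makes it a length-m vector
a = spsolve(lhs, rhs)                                           # a has shape (17956,)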
