Vectorizing or boosting time for an interpolation in Python - python

I need to speed up an interpolation over a large (NxMxT) matrix MTR, where:
N is about 8000;
M is about 10000;
T represents the number of times at which each NxM matrix is calculated (in my case it's 23).
I have to compute the interpolation element-wise, over all T times, and return the interpolated values on a different array of times (T_interp, in my case with length 47), so the output is an NxMxT_interp matrix.
The code snippet below defines the function I built for the interpolation, using scipy.interpolate.Rbf (y is the array MTR[i,j,:], x is the times array with length T, and x_interp is the new array of times with length T_interp):
#==============================================================================
# Interpolate without nans
#==============================================================================
def interp(x,y,x_interp,**kwargs):
    import numpy as np
    from scipy.interpolate import Rbf
    mask = np.isnan(y)
    y_mask = np.ma.array(y,mask = mask)
    x_new = [x[i] for i in np.where(~mask)[0]]
    if len(y_mask.compressed()) == 0:
        return [np.nan for i,n in enumerate(x_interp)]
    elif len(y_mask.compressed()) == 1:
        return [y_mask.compressed() for i,n in enumerate(x_interp)]
    interp = Rbf(x_new,y_mask.compressed(),**kwargs)
    y_interp = interp(x_interp)
    return y_interp
I tried to achieve my goal either by looping over the NxM elements of the MTR matrix:
new_MTR = np.empty((N,M,T_interp))
for i in range(N):
    for j in range(M):
        new_MTR[i,j,:]=interp(times,MTR[i,j,:],New_times,function = 'linear')
or by using the np.apply_along_axis function:
new_MTR = np.apply_along_axis(lambda x: interp(times,x,New_times,function = 'linear'),2,MTR)
In both cases I estimated the time the whole operation takes. np.apply_along_axis appears to be slightly faster, but it would still take about 15 hours!
Is there a way to reduce this time? Maybe by vectorizing the entire operation? I don't know much about vectorizing and how it can be done in a situation like mine so any help would be much appreciated. Thank you!
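One possible direction, sketched under assumptions the question does not confirm: if plain piecewise-linear interpolation between neighbouring time samples is acceptable, and a pixel's series contains no NaNs, every pixel can be interpolated at once with array indexing instead of one Rbf call per pixel. The function name interp_linear_last_axis is hypothetical, times is assumed to be strictly increasing, and values outside the sampled time range are clamped to the end samples (which may differ from what Rbf extrapolates). Pixels that do contain NaNs would still need the per-pixel function above.

```python
import numpy as np

def interp_linear_last_axis(times, data, new_times):
    """Piecewise-linear interpolation along the last axis of `data`.

    Assumes `times` is strictly increasing and `data` holds no NaNs.
    New times outside [times[0], times[-1]] get the nearest end sample.
    """
    times = np.asarray(times, dtype=float)
    new_times = np.asarray(new_times, dtype=float)
    # Index of the right-hand neighbour of every new time, kept in 1..T-1.
    hi = np.clip(np.searchsorted(times, new_times), 1, len(times) - 1)
    lo = hi - 1
    # Fractional position between the two neighbours, clamped to [0, 1].
    w = np.clip((new_times - times[lo]) / (times[hi] - times[lo]), 0.0, 1.0)
    # Broadcasting applies the same weights to every (i, j) pixel.
    return data[..., lo] * (1.0 - w) + data[..., hi] * w
```

Because the full NxMxT_interp result is large, it may be worth calling this on blocks of rows (for example MTR[i0:i1]) and writing each block into a preallocated output.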

Related

Interpolate Image for given indices python

I've an image of about 8000x9000 size as a numpy matrix. I also have a list of indices in a numpy 2xn matrix. These indices are fractional as well as may be out of image size. I need to interpolate the image and find the values for the given indices. If the indices fall outside, I need to return numpy.nan for them. Currently I'm doing it in for loop as below
def interpolate_image(image: numpy.ndarray, indices: numpy.ndarray) -> numpy.ndarray:
    """
    :param image:
    :param indices: 2xN matrix. 1st row is dim1 (rows) indices, 2nd row is dim2 (cols) indices
    :return:
    """
    # Todo: Vectorize this
    M, N = image.shape
    num_indices = indices.shape[1]
    interpolated_image = numpy.zeros((1, num_indices))
    for i in range(num_indices):
        x, y = indices[:, i]
        if (x < 0 or x > M - 1) or (y < 0 or y > N - 1):
            interpolated_image[0, i] = numpy.nan
        else:
            # Todo: Do Bilinear Interpolation. For now nearest neighbor is implemented
            interpolated_image[0, i] = image[int(round(x)), int(round(y))]
    return interpolated_image
But the for loop takes a huge amount of time (as expected). How can I vectorize this? I found scipy.interpolate.interp2d, but I'm not able to use it. Can someone explain how to use it? Any other method is also fine. I also found this, but it does not meet my requirements: given x and y indices, it generates an interpolated matrix over the grid they span. I don't want that. For the given indices, I just want the interpolated values, i.e. I need a vector output, not a matrix.
I tried like this, but as said above, it gives a matrix output
f = interpolate.interp2d(numpy.arange(image.shape[0]), numpy.arange(image.shape[1]), image, kind='linear')
interp_image_vect = f(indices[:,0], indices[:,1])
RuntimeError: Cannot produce output of size 73156608x73156608 (size too large)
For now, I've implemented nearest-neighbor interpolation. scipy interp2d doesn't have nearest neighbor. It would be good if the library function had nearest neighbor (so I can compare). If not, that is fine too.
It looks like scipy.interpolate.RectBivariateSpline will do the trick:
import numpy
from scipy.interpolate import RectBivariateSpline
# image and indices are as given in the question
M, N = image.shape
spline = RectBivariateSpline(numpy.arange(M), numpy.arange(N), image)
interpolated = spline(indices[0], indices[1], grid=False)
This gets you the interpolated values, but it doesn't give you nan where you need it. You can get that with where:
nans = numpy.zeros(interpolated.shape) + numpy.nan
x_in_bounds = (0 <= indices[0]) & (indices[0] <= M - 1)
y_in_bounds = (0 <= indices[1]) & (indices[1] <= N - 1)
bounded = numpy.where(x_in_bounds & y_in_bounds, interpolated, nans)
I tested this with a 2624x2624 image and 100,000 points in indices and all told it took under a second.
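For comparison with the nearest-neighbor loop in the question, here is a minimal vectorized sketch in plain NumPy. The function name nearest_neighbour_lookup is hypothetical, and it returns a 1-D array of length N rather than the 1xN array of the original loop. It uses numpy.rint, which rounds halves to the nearest even integer just as Python's round does.

```python
import numpy as np

def nearest_neighbour_lookup(image, indices):
    """Nearest-neighbour values of `image` at fractional (row, col) indices.

    `indices` is a 2xN array; points outside the image yield NaN.
    """
    rows, cols = indices
    M, N = image.shape
    # Same bounds test as the loop in the question.
    inside = (rows >= 0) & (rows <= M - 1) & (cols >= 0) & (cols <= N - 1)
    out = np.full(indices.shape[1], np.nan)
    # Round only the in-bounds points and look them all up in one indexing call.
    r = np.rint(rows[inside]).astype(int)
    c = np.rint(cols[inside]).astype(int)
    out[inside] = image[r, c]
    return out
```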

Scipy: epsilon neighborhood by sparse similarity with threshold

I am wondering if scipy offers the option to implement a primitive but memory-friendly approach to epsilon neighborhood search:
Compute pairwise similarity for my data, but set all similarities smaller than a threshold epsilon to zero on the fly and then output result directly as sparse matrix.
For example scipy.spatial.distance.pdist() is really fast, but the memory limit is reached early compared to my time limit, at least if I take squareform().
I know there are O(n*log(n)) solutions in this case but for now it would be enough if the result could be sparse. Also obviously I would have to use a similarity as opposed to a distance, but that should not be such a big problem, should it.
As long as you can recast your similarity measure in terms of a distance metric (say 1 minus the similarity) then the most efficient solution is to use sklearn's BallTree.
Otherwise you could build your own scipy.sparse.csr_matrix by comparing each point against the other points and throwing away all values smaller than the threshold.
Without knowing your specific similarity metric, this code should roughly do the trick:
import scipy.sparse as spsparse
from sklearn.metrics.pairwise import cosine_similarity
import numpy as np
def sparse_similarity(X, epsilon=0.99, Y=None, similarity_metric=cosine_similarity):
    '''
    X : ndarray
        An m by n array of m original observations in an n-dimensional space.
    '''
    Nx, Dx = X.shape
    if Y is None:
        Y=X
    Ny, Dy = Y.shape
    assert Dx==Dy
    data = []
    indices = []
    indptr = [0]
    for ix in range(Nx):
        xsim = similarity_metric([X[ix]], Y)
        _ , kept_points = np.nonzero(xsim>=epsilon)
        data.extend(xsim[0,kept_points])
        indices.extend(kept_points)
        indptr.append(indptr[-1] + len(kept_points))
    return spsparse.csr_matrix((data, indices, indptr), shape=(Nx,Ny))
X = np.random.random(size=(1000,10))
sparse_similarity(X, epsilon=0.95)
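If the similarity can be recast as a Euclidean distance, the tree-based route mentioned above can also be sketched with SciPy alone. This swaps sklearn's BallTree for scipy.spatial.cKDTree, whose sparse_distance_matrix method keeps only the pairs within max_distance; the helper name sparse_epsilon_distances is hypothetical. Note that a stored zero cannot be told apart from "farther than the threshold" once the result is converted to a dense array.

```python
import numpy as np
from scipy.spatial import cKDTree

def sparse_epsilon_distances(X, max_distance):
    """Sparse matrix of Euclidean distances, keeping pairs within max_distance."""
    # Two trees over the same points: the general two-tree case of the API.
    tree_a = cKDTree(X)
    tree_b = cKDTree(X)
    coo = tree_a.sparse_distance_matrix(tree_b, max_distance,
                                        output_type="coo_matrix")
    return coo.tocsr()

X = np.random.random(size=(200, 3))
S = sparse_epsilon_distances(X, max_distance=0.2)
```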

vectorized / linear algebra distance between points?

Suppose I have an array of points,
import numpy as np
pts = np.random.rand(100,3) # 100 points, X, Y, Z along second dimension
The naive approach to calculate the distance between each combination of points involves a double for loop and will be excruciatingly slow for large numbers of points,
def euclidian_distance(p1, p2):
    d = p2 - p1
    # sum the squared differences first, then take the square root
    return np.sqrt((d**2).sum())
out = np.empty((pts.shape[0], pts.shape[0]))
# iterate over the points (rows of pts), not over the coordinate axes
for idx, point in enumerate(pts):
    for idx2, point_inner in enumerate(pts):
        out[idx,idx2] = euclidian_distance(point, point_inner)
How do I vectorize this calculation?
Take a look at scipy.spatial.distance.cdist. I'm not sure, but I assume that SciPy has optimized this function quite a lot. If you use the pts array for both inputs, I assume you'll get an M x M array with zeros on the diagonal.
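A short sketch of that suggestion, next to a pure NumPy broadcasting version for comparison (the variable names dist_cdist and dist_numpy are only illustrative). The broadcasting version builds an intermediate array of shape (M, M, 3), so it uses more memory than cdist.

```python
import numpy as np
from scipy.spatial.distance import cdist

pts = np.random.rand(100, 3)  # 100 points, X, Y, Z along the second dimension

# Pairwise Euclidean distances via SciPy.
dist_cdist = cdist(pts, pts)

# The same computation with broadcasting: (100, 1, 3) - (1, 100, 3).
diff = pts[:, np.newaxis, :] - pts[np.newaxis, :, :]
dist_numpy = np.sqrt((diff ** 2).sum(axis=-1))
```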

Optimizing histogram distance metric for two matrices in Python

I have two matrices A and B, each with a size of NxM, where N is the number of samples and M is the size of histogram bins. Thus, each row represents a histogram for that particular sample.
What I would like to do is compute the chi-square distance between the two matrices for every pair of samples. Each row of matrix A is compared to all rows of matrix B, resulting in a final matrix C of size NxN, where C[i,j] is the chi-square distance between the histograms A[i] and B[j].
Here is my python code that does the job:
import numpy as np
def chi_square(histA,histB):
    eps = 1.e-10  # avoids division by zero for bins that are empty in both histograms
    d = np.sum((histA-histB)**2/(histA+histB+eps))
    return 0.5*d
def matrix_cost(A,B):
    a,_ = A.shape
    b,_ = B.shape
    C = np.zeros((a,b))
    for i in range(a):
        for j in range(b):
            C[i,j] = chi_square(A[i],B[j])
    return C
Currently, for a 100x70 matrix, this entire process takes 0.1 seconds.
Is there any way to improve this performance?
I would appreciate any thoughts or recommendations.
Thank you.
Sure! I'm assuming you're using numpy?
If you have the RAM available, you could broadcast the arrays and use numpy's efficient vectorization of the operations on those arrays.
Here's how:
Abroad = A[:,np.newaxis,:] # prepared for broadcasting
C = np.sum((Abroad - B)**2/(Abroad + B), axis=-1)/2.
Timing considerations on my platform show a factor of 10 speed gain compared to your algorithm.
A slower option (but still faster than your original algorithm) that uses less RAM than the previous option is simply to broadcast the rows of A into 2D arrays:
def new_way(A,B):
    C = np.empty((A.shape[0],B.shape[0]))
    for rowind, row in enumerate(A):
        C[rowind,:] = np.sum((row - B)**2/(row + B), axis=-1)/2.
    return C
This has the advantage that it can be run for arrays with shape (N,M) much larger than (100,70).
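The broadcast expressions above leave out the small eps term from the question, so a bin that is empty in both histograms produces 0/0 and therefore NaN. A minimal sketch that keeps the term is shown below; the function name chi_square_matrix is hypothetical and the default eps is the value used in the question.

```python
import numpy as np

def chi_square_matrix(A, B, eps=1e-10):
    """Chi-square distance between every row of A and every row of B."""
    A3 = A[:, np.newaxis, :]  # shape (N_a, 1, M), broadcasts against (N_b, M)
    # eps keeps the denominator non-zero when both bins are empty.
    return 0.5 * np.sum((A3 - B) ** 2 / (A3 + B + eps), axis=-1)
```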
You could also look to Theano to push the expensive for-loops to the C-level if you don't have the memory available. I get a factor 2 speed gain compared to the first option (not taking into account the initial compile time) for both the (100,70) arrays as well as (1000,70):
import theano
import theano.tensor as T
X = T.matrix("X")
Y = T.matrix("Y")
results, updates = theano.scan(lambda x_i: ((x_i - Y)**2/(x_i+Y)).sum(axis=1)/2., sequences=X)
chi_square_norm = theano.function(inputs=[X, Y], outputs=[results])
chi_square_norm(A,B) # same result

sparse 3d matrix/array in Python?

In scipy, we can construct a sparse matrix using scipy.sparse.lil_matrix() etc. But the matrix is in 2d.
I am wondering if there is an existing data structure for sparse 3d matrix / array (tensor) in Python?
p.s. I have lots of sparse data in 3d and need a tensor to store / perform multiplication. Any suggestions to implement such a tensor if there's no existing data structure?
Happy to suggest a (possibly obvious) implementation of this, which could be made in pure Python or C/Cython if you've got time and space for new dependencies, and need it to be faster.
A sparse matrix in N dimensions can assume most elements are empty, so we use a dictionary keyed on tuples:
class NDSparseMatrix:
    def __init__(self):
        self.elements = {}
    def addValue(self, tuple, value):
        self.elements[tuple] = value
    def readValue(self, tuple):
        try:
            value = self.elements[tuple]
        except KeyError:
            # could also be 0.0 if using floats...
            value = 0
        return value
and you would use it like so:
sparse = NDSparseMatrix()
sparse.addValue((1,2,3), 15.7)
should_be_zero = sparse.readValue((1,5,13))
You could make this implementation more robust by verifying that the input is in fact a tuple, and that it contains only integers, but that will just slow things down so I wouldn't worry unless you're releasing your code to the world later.
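For illustration only, the key checks described above might look like the following sketch. The class name CheckedNDSparseMatrix, the ndim and default parameters, and the use of __getitem__/__setitem__ instead of addValue/readValue are assumptions made here, not part of the original class.

```python
class CheckedNDSparseMatrix:
    """Dictionary-backed sparse array that validates its keys (hypothetical)."""

    def __init__(self, ndim, default=0):
        self.ndim = ndim
        self.default = default
        self.elements = {}

    def _check(self, key):
        # Reject anything that is not a tuple of exactly ndim integers.
        if not isinstance(key, tuple) or len(key) != self.ndim:
            raise TypeError(f"key must be a tuple of {self.ndim} integers")
        if not all(isinstance(i, int) for i in key):
            raise TypeError("every index must be an int")

    def __setitem__(self, key, value):
        self._check(key)
        self.elements[key] = value

    def __getitem__(self, key):
        self._check(key)
        # Missing entries fall back to the default instead of raising KeyError.
        return self.elements.get(key, self.default)


sparse = CheckedNDSparseMatrix(ndim=3)
sparse[1, 2, 3] = 15.7
value = sparse[1, 2, 3]      # stored entry
missing = sparse[1, 5, 13]   # never set, so the default is returned
```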
EDIT - a Cython implementation of the matrix multiplication problem, assuming other tensor is an N Dimensional NumPy array (numpy.ndarray) might look like this:
#cython: boundscheck=False
#cython: wraparound=False
import numpy as np
cimport numpy as np
def sparse_mult(object sparse, np.ndarray[double, ndim=3] u):
    cdef unsigned int i, j, k
    # zero-initialised so the border entries the loops skip are well defined
    out = np.zeros((u.shape[0], u.shape[1], u.shape[2]), dtype=np.float64)
    for i in range(1, u.shape[0]-1):
        for j in range(1, u.shape[1]-1):
            for k in range(1, u.shape[2]-1):
                # note, here you must define your own rank-3 multiplication rule, which
                # is, in general, nontrivial, especially if LxMxN tensor...
                # loop over a dummy variable (or two) and perform some summation:
                out[i,j,k] = u[i,j,k] * sparse.readValue((i,j,k))
    return out
Although you will always need to hand roll this for the problem at hand, because (as mentioned in code comment) you'll need to define which indices you're summing over, and be careful about the array lengths or things won't work!
EDIT 2 - if the other matrix is also sparse, then you don't need to do the three way looping:
def sparse_mult(sparse, other_sparse):
    out = NDSparseMatrix()
    for key, value in sparse.elements.items():
        i, j, k = key
        # note, here you must define your own rank-3 multiplication rule, which
        # is, in general, nontrivial, especially if LxMxN tensor...
        # loop over a dummy variable (or two) and perform some summation
        # (example indices shown):
        out.addValue(key, out.readValue(key) +
                     other_sparse.readValue((i,j,k+1)) * sparse.readValue((i-3,j,k)))
    return out
My suggestion for a C implementation would be to use a simple struct to hold the indices and the values:
typedef struct {
    int index[3];
    float value;
} entry_t;
you'll then need some functions to allocate and maintain a dynamic array of such structs, and search them as fast as you need; but you should test the Python implementation in place for performance before worrying about that stuff.
An alternative answer as of 2017 is the sparse package. According to the package itself it implements sparse multidimensional arrays on top of NumPy and scipy.sparse by generalizing the scipy.sparse.coo_matrix layout.
Here's an example taken from the docs:
import numpy as np
n = 1000
ndims = 4
nnz = 1000000
coords = np.random.randint(0, n - 1, size=(ndims, nnz))
data = np.random.random(nnz)
import sparse
x = sparse.COO(coords, data, shape=((n,) * ndims))
x
# <COO: shape=(1000, 1000, 1000, 1000), dtype=float64, nnz=1000000>
x.nbytes
# 16000000
y = sparse.tensordot(x, x, axes=((3, 0), (1, 2)))
y
# <COO: shape=(1000, 1000, 1000, 1000), dtype=float64, nnz=1001588>
Have a look at sparray - sparse n-dimensional arrays in Python (by Jan Erik Solem). Also available on github.
Nicer than writing everything new from scratch may be to use scipy's sparse module as far as possible. This may lead to (much) better performance. I had a somewhat similar problem, but I only had to access the data efficiently, not perform any operations on them. Furthermore, my data were only sparse in two out of three dimensions.
I have written a class that solves my problem and could (as far as I can tell) easily be extended to satisfy the OP's needs. It may still hold some potential for improvement, though.
import scipy.sparse as sp
import numpy as np
class Sparse3D():
    """
    Class to store and access 3 dimensional sparse matrices efficiently
    """
    def __init__(self, *sparseMatrices):
        """
        Constructor
        Takes a stack of sparse 2D matrices with the same dimensions
        """
        self.data = sp.vstack(sparseMatrices, "dok")
        self.shape = (len(sparseMatrices), *sparseMatrices[0].shape)
        self._dim1_jump = np.arange(0, self.shape[1]*self.shape[0], self.shape[1])
        self._dim1 = np.arange(self.shape[0])
        self._dim2 = np.arange(self.shape[1])
    def __getitem__(self, pos):
        if not type(pos) == tuple:
            if not hasattr(pos, "__iter__") and not type(pos) == slice:
                return self.data[self._dim1_jump[pos] + self._dim2]
            else:
                return Sparse3D(*(self[self._dim1[i]] for i in self._dim1[pos]))
        elif len(pos) > 3:
            raise IndexError("too many indices for array")
        else:
            if (not hasattr(pos[0], "__iter__") and not type(pos[0]) == slice or
                    not hasattr(pos[1], "__iter__") and not type(pos[1]) == slice):
                if len(pos) == 2:
                    result = self.data[self._dim1_jump[pos[0]] + self._dim2[pos[1]]]
                else:
                    result = self.data[self._dim1_jump[pos[0]] + self._dim2[pos[1]], pos[2]].T
                    if hasattr(pos[2], "__iter__") or type(pos[2]) == slice:
                        result = result.T
                return result
            else:
                if len(pos) == 2:
                    return Sparse3D(*(self[i, self._dim2[pos[1]]] for i in self._dim1[pos[0]]))
                else:
                    if not hasattr(pos[2], "__iter__") and not type(pos[2]) == slice:
                        return sp.vstack([self[self._dim1[pos[0]], i, pos[2]]
                                          for i in self._dim2[pos[1]]]).T
                    else:
                        return Sparse3D(*(self[i, self._dim2[pos[1]], pos[2]]
                                          for i in self._dim1[pos[0]]))
    def toarray(self):
        return np.array([self[i].toarray() for i in range(self.shape[0])])
I also need 3D sparse matrix for solving the 2D heat equations (2 spatial dimensions are dense, but the time dimension is diagonal plus and minus one offdiagonal.) I found this link to guide me. The trick is to create an array Number that maps the 2D sparse matrix to a 1D linear vector. Then build the 2D matrix by building a list of data and indices. Later the Number matrix is used to arrange the answer back to a 2D array.
[edit] It occurred to me after my initial post, this could be handled better by using the .reshape(-1) method. After research, the reshape method is better than flatten because it returns a new view into the original array, but flatten copies the array. The code uses the original Number array. I will try to update later.[end edit]
I test it by creating a 1D random vector and solving for a second vector. Then multiply it by the sparse 2D matrix and I get the same result.
Note: I repeat this many times in a loop with exactly the same matrix M, so you might think it would be more efficient to solve for inverse(M). But the inverse of M is not sparse, so I think (but have not tested) using spsolve is a better solution. "Best" probably depends on how large the matrix is you are using.
#!/usr/bin/env python3
# testSparse.py
# profhuster
import numpy as np
import scipy.sparse as sM
import scipy.sparse.linalg as spLA
from array import array
from numpy.random import rand, seed
seed(101520)
nX = 4
nY = 3
r = 0.1
def loadSpNodes(nX, nY, r):
    # Matrix to map 2D array of nodes to 1D array
    Number = np.zeros((nY, nX), dtype=int)
    # Map each element of the 2D array to a 1D array
    iM = 0
    for i in range(nX):
        for j in range(nY):
            Number[j, i] = iM
            iM += 1
    print(f"Number = \n{Number}")
    # Now create a sparse matrix of the "stencil"
    diagVal = 1 + 4 * r
    offVal = -r
    d_list = array('f')
    i_list = array('i')
    j_list = array('i')
    # Loop over the 2D nodes matrix
    for i in range(nX):
        for j in range(nY):
            # Recall the 1D number
            iSparse = Number[j, i]
            # populate the diagonal
            d_list.append(diagVal)
            i_list.append(iSparse)
            j_list.append(iSparse)
            # Now, for each rectangular neighbor, add the
            # off-diagonal entries
            # Use a try-except, so boundary nodes work
            for (jj,ii) in ((j+1,i),(j-1,i),(j,i+1),(j,i-1)):
                try:
                    iNeigh = Number[jj, ii]
                    if jj >= 0 and ii >=0:
                        d_list.append(offVal)
                        i_list.append(iSparse)
                        j_list.append(iNeigh)
                except IndexError:
                    pass
    spNodes = sM.coo_matrix((d_list, (i_list, j_list)), shape=(nX*nY,nX*nY))
    return spNodes
MySpNodes = loadSpNodes(nX, nY, r)
print(f"Sparse Nodes = \n{MySpNodes.toarray()}")
b = rand(nX*nY)
print(f"b=\n{b}")
x = spLA.spsolve(MySpNodes.tocsr(), b)
print(f"x=\n{x}")
print(f"Multiply back together=\n{x * MySpNodes}")
I needed a 3D lookup table for x, y, z and came up with this solution.
Why not fold two of the dimensions into one? I.e. use x and 'yz' as the matrix dimensions.
E.g. if x has 80 potential members, y has 100 and z has 20,
you make the sparse matrix 80 by 2000 (i.e. yz = 100x20).
The x dimension is as usual.
yz dimension: the first 100 elements represent z=0, y=0 to 99,
..............the second 100 represent z=1, y=0 to 99, etc.
So a given element located at (x,y,z) would be in the sparse matrix at (x, z*100 + y).
If you need to use negative numbers, design an arbitrary offset into your matrix translation. The solution could be expanded to n dimensions if necessary.
from scipy import sparse
m = sparse.lil_matrix((100,2000), dtype=float)
def add_element(xyz, element):
    x, y, z = xyz
    element=float(element)
    m[x,y+z*100]=element
def get_element(x,y,z):
    return m[x,y+z*100]
add_element([3,2,4],2.2)
add_element([20,15,7], 1.2)
print(get_element(0,0,0))
print(get_element(3,2,4))
print(get_element(20,15,7))
print(" This is m sparse:");print(m)
====================
OUTPUT:
0.0
2.2
1.2
This is m sparse:
(3, 402L) 2.2
(20, 715L) 1.2
====================
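The index arithmetic behind this layout can be sketched on its own. The helper names to_column and from_column are hypothetical; NY and NZ are the y and z sizes from the example. NumPy's ravel_multi_index and unravel_index perform the same mapping when z is treated as the slower-varying axis.

```python
import numpy as np

NY, NZ = 100, 20  # sizes of the y and z dimensions from the example above

def to_column(y, z):
    """Fold (y, z) into the single column index used by the 2D sparse matrix."""
    return z * NY + y

def from_column(col):
    """Recover (y, z) from a folded column index."""
    z, y = divmod(col, NY)
    return y, z

col = to_column(2, 4)                           # column for y=2, z=4
same = np.ravel_multi_index((4, 2), (NZ, NY))   # NumPy equivalent, (z, y) order
```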
