I am trying to generate a matrix (a tensor in PyTorch) that is similar to a Gram matrix, except that I need to apply a kernel function instead of the inner product to my input matrix.
A double for-loop like the one below works:
N = x.shape[0]  # x.shape = (N, d)
G = torch.zeros((N, N))
for i in range(N):
    for j in range(N):
        G[i][j] = K(x[i], x[j])
where x is my input tensor whose shape is (N,d) and the kernel function K(a,b) yields a real value after performing some math. For example:
def K(a, b):
    return (1 + (a * b).sum()).pow(2)  # second-degree polynomial kernel
I want to generate this matrix G without having to change the kernel function K() and, of course, without for-loops!
My initial attempt was a lambda approach, but the code below obviously doesn't work, as it only yields a list of K(x[i], x[i]) values:
G = torch.tensor(list(map(lambda a, b: K(a, b), x, x)))
How can I use the lambda function to yield N-by-N matrix?
What would be some other ways to tackle this problem?
Any insight would be appreciated.
You can calculate G from x simply with:
G = (1 + torch.matmul(x, x.T)).pow(2)
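This works because torch.matmul(x, x.T) computes every pairwise inner product x[i]·x[j] at once, so applying (1 + ·)² elementwise reproduces the kernel for all pairs. A minimal sanity check (a sketch with small, made-up dimensions) comparing the loop version against the vectorized one:

import torch

N, d = 5, 3
x = torch.randn(N, d)

def K(a, b):
    return (1 + (a * b).sum()).pow(2)  # second-degree polynomial kernel

# Loop version for reference.
G_loop = torch.tensor([[K(x[i], x[j]).item() for j in range(N)] for i in range(N)])
# Vectorized version: all pairwise inner products, then the elementwise transform.
G_vec = (1 + torch.matmul(x, x.T)).pow(2)
print(torch.allclose(G_loop, G_vec))  # True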
I'm currently trying to fill a matrix K where each entry in the matrix is just a function applied to two entries of an array x.
At the moment I'm using the most obvious method of running through rows and columns one at a time using a double for-loop:
K = np.zeros((x.shape[0], x.shape[0]), dtype=np.float32)
for i in range(x.shape[0]):
    for j in range(x.shape[0]):
        K[i, j] = f(x[i], x[j])
While this works fine, the resulting matrix is 10,000 by 10,000 and takes very long to calculate. I was wondering if there is a more efficient way to do this built into NumPy?
EDIT: The function in question here is a Gaussian kernel:
def gaussian(a, b, sigma):
    vec = a - b
    return np.exp(-np.dot(vec, vec) / (2 * sigma**2))
where I set sigma in advance before calculating the matrix.
The array x is an array of shape (10000, 8). So the scalar product in the gaussian is between two vectors of dimension 8.
You can use a single for-loop together with broadcasting. This requires changing the implementation of the gaussian function to accept 2D inputs:
def gaussian(a, b, sigma):
    vec = a - b
    return np.exp(-np.sum(vec**2, axis=-1) / (2 * sigma**2))

K = np.zeros((x.shape[0], x.shape[0]), dtype=np.float32)
for i in range(x.shape[0]):
    K[i] = gaussian(x[i:i+1], x, sigma)
Theoretically you could accomplish this even without any for-loop, again by using broadcasting, but here an intermediate array of size len(x)**2 * x.shape[1] will be created, which might run out of memory for your array sizes:
K = gaussian(x[None, :, :], x[:, None, :], sigma)
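If memory is the bottleneck, a common alternative (a sketch, not part of the original answer) is to expand the squared distance as ||a - b||² = ||a||² + ||b||² - 2·a·b, which only ever materializes N×N intermediates:

import numpy as np

def gaussian_gram(x, sigma):
    # ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a.b, computed for all pairs at once.
    sq = np.sum(x**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (x @ x.T)
    np.maximum(d2, 0, out=d2)  # clip tiny negatives caused by rounding
    return np.exp(-d2 / (2 * sigma**2))

For x of shape (10000, 8) this needs only a few 10000×10000 float arrays, i.e. the same order of memory as the output itself.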
I want to multiply both matrices below and set the result as the objective for my model:
m = gp.Model("matrix")
x = m.addMVar((9, 9), vtype=GRB.BINARY, name="x")
c = np.random.rand(9, 9)
m.setObjective(x @ c, GRB.MINIMIZE)
Here's what I am trying to achieve (shown as a picture in the original post).
This gives me the following error:
Error code -1: Variable is not a 1D MVar object
How can I solve that? I suppose Gurobi doesn't accept multiplication of 2D MVar objects.
As already mentioned in the comments, note that the product of two matrices is again a matrix and the evaluated objective needs to be a scalar, so this is probably not what you want to do. According to your picture, your objective is a simple linear expression, not a matrix product. Hence, it's much easier to use Gurobi's algebraic modelling interface, i.e. Vars instead of MVars:
import gurobipy as gp
from gurobipy import GRB, quicksum as qsum
import numpy as np
M, N = 9, 9
m = gp.Model("matrix")
x = m.addVars(M, N, vtype="B", name="x")
c = np.random.rand(M, N)
m.setObjective(qsum(c[i,j]*x[i,j] for i in range(M) for j in range(N)), GRB.MINIMIZE)
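If you would rather stay with the matrix-oriented API, the error message suggests that 1D MVars are accepted; a sketch (assuming the objective really is the sum of c[i,j]·x[i,j] over all entries, as above) flattens the variables:

import gurobipy as gp
from gurobipy import GRB
import numpy as np

M, N = 9, 9
m = gp.Model("matrix")
x = m.addMVar(M * N, vtype=GRB.BINARY, name="x")  # 1D MVar of length 81
c = np.random.rand(M, N)
m.setObjective(c.flatten() @ x, GRB.MINIMIZE)     # scalar linear objective

After optimizing, the matrix layout can be recovered with x.X.reshape(M, N).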
I have to speed up an interpolation over a large (NxMxT) matrix MTR, where:
N is about 8000;
M is about 10000;
T represents the number of times at which each NxM matrix is calculated (in my case it's 23).
I have to compute the interpolation element-wise, on all the T different times, and return the interpolated values over a different array of times (T_interp, in my case of length 47); so, as output, I want an NxMxT_interp matrix.
The code snippet below defines the function I built for the interpolation, using scipy.interpolate.Rbf (y is the array MTR[i,j,:], x is the times array with length T, and x_interp is the new array of times with length T_interp):
#==============================================================================
# Interpolate without nans
#==============================================================================
def interp(x, y, x_interp, **kwargs):
    import numpy as np
    from scipy.interpolate import Rbf
    mask = np.isnan(y)
    y_mask = np.ma.array(y, mask=mask)
    x_new = [x[i] for i in np.where(~mask)[0]]
    if len(y_mask.compressed()) == 0:
        return [np.nan for i, n in enumerate(x_interp)]
    elif len(y_mask.compressed()) == 1:
        return [y_mask.compressed() for i, n in enumerate(x_interp)]
    interp = Rbf(x_new, y_mask.compressed(), **kwargs)
    y_interp = interp(x_interp)
    return y_interp
I tried to achieve my goal either by looping over the NxM elements of the MTR matrix:
new_MTR = np.empty((N, M, T_interp))
for i in range(N):
    for j in range(M):
        new_MTR[i, j, :] = interp(times, MTR[i, j, :], New_times, function='linear')
or by using the np.apply_along_axis function:
new_MTR = np.apply_along_axis(lambda x: interp(times,x,New_times,function = 'linear'),2,MTR)
In both cases I estimated the time it takes to perform the whole operation; it appears to be slightly better with np.apply_along_axis, but it will still take about 15 hours!!
Is there a way to reduce this time? Maybe by vectorizing the entire operation? I don't know much about vectorizing and how it can be done in a situation like mine, so any help would be much appreciated. Thank you!
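Since the question uses function='linear' and every (i, j) series shares the same time axis, one possible vectorization (a sketch with hypothetical names that ignores the NaN handling of the interp function above) is to compute the bracketing indices and weights once and broadcast them over the leading NxM dimensions:

import numpy as np

def interp_linear_vectorized(times, MTR, new_times):
    # Bracketing indices for each new time (clipped so edge points extrapolate
    # linearly from the first/last segment, as a linear Rbf would).
    idx = np.searchsorted(times, new_times).clip(1, len(times) - 1)
    lo, hi = idx - 1, idx
    w = (new_times - times[lo]) / (times[hi] - times[lo])  # shape (T_interp,)
    # Fancy-index the time axis once; broadcasting handles all N*M series at once.
    return MTR[..., lo] * (1 - w) + MTR[..., hi] * w

NaN-containing series would still need a separate masked pass, but for fully valid entries this replaces N*M individual Rbf fits with a handful of array operations.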
I have two square matrices of the same size and the dimensions of a square patch. I'd like to compute the dot product between every pair of patches. Essentially I would like to implement the following operation:
def patch_dot(A, B, patch_dim):
    res_dim = A.shape[0] - patch_dim + 1
    res = np.zeros([res_dim, res_dim, res_dim, res_dim])
    for i in range(res_dim):
        for j in range(res_dim):
            for k in range(res_dim):
                for l in range(res_dim):
                    res[i, j, k, l] = (A[i:i + patch_dim, j:j + patch_dim] *
                                       B[k:k + patch_dim, l:l + patch_dim]).sum()
    return res
Obviously this would be an extremely inefficient implementation. TensorFlow's tf.nn.conv2d seems like a natural solution, as I'm essentially doing a convolution; however, my filter matrix isn't fixed. Is there a natural solution to this in TensorFlow, or should I start looking at implementing my own TF op?
The natural way to do this is to first extract overlapping image patches of matrix B using tf.extract_image_patches, then to apply the tf.nn.conv2d function on A and each B sub-patch using tf.map_fn.
Note that prior to using tf.extract_image_patches and tf.nn.conv2d you need to reshape your matrices as 4D tensors of shape [1, width, height, 1] using tf.reshape.
Also, prior to using tf.map_fn, you would need to use the tf.transpose op so that the B sub-patches are indexed by the first dimension of the tensor you pass as the elems argument of tf.map_fn.
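For reference, the original loop can also be vectorized directly in NumPy (a sketch assuming NumPy >= 1.20 for sliding_window_view): extract all patches as strided views and contract the two patch axes in a single einsum:

import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def patch_dot_fast(A, B, patch_dim):
    # Views of shape (res_dim, res_dim, patch_dim, patch_dim); no copies made.
    pa = sliding_window_view(A, (patch_dim, patch_dim))
    pb = sliding_window_view(B, (patch_dim, patch_dim))
    # res[i, j, k, l] = sum_{p, q} pa[i, j, p, q] * pb[k, l, p, q]
    return np.einsum('ijpq,klpq->ijkl', pa, pb)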
NumPy has a function to compute the covariance from an array, which is fine. However, I would like to do it using generators to save memory. Is there some way to do this without writing my own cov function?
You can use the following implementation:
from numpy import outer

def gen_cov(g):
    # Welford-style streaming update: a single pass over the generator.
    mean, covariance = 0, 0
    for i, x in enumerate(g):
        diff = x - mean
        mean += diff / (i + 1)
        covariance += outer(diff, diff) * i / (i + 1)
    return covariance / i  # i ends at n - 1, so this is the unbiased estimate
You may want to use something different from numpy.outer depending on what the generator elements are. This is a Python implementation of this answer.
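A quick check of the streaming result against NumPy's built-in (a sketch with made-up data; the rows of a matrix play the role of the generator elements):

import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(size=(1000, 3))
# Feeding the rows one by one should match the batch computation.
print(np.allclose(gen_cov(iter(data)), np.cov(data, rowvar=False)))  # True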