How to vectorize this batched-pairwise computation in PyTorch?

I want to write a batched, pairwise bivariate Moran's I. The formula can be found here.
If X and Y are both (n,n), then the weight matrix W has dimension (n^2, n^2).
I think I have a vectorization for a toy example with a single pair, shown below. Note that you have to flatten and standardize x and y first.
import numpy as np
import torch
n_elts = 4
x = torch.FloatTensor([0,1,1,0])
y = torch.FloatTensor([0,1,1,0])
w = torch.FloatTensor(np.identity(n_elts))
x = x - torch.mean(x)
y = y - torch.mean(y)
ans = torch.sum(torch.outer(x,y) * w)/(torch.norm(x)**2) * (n_elts/torch.sum(w)) # = 1
I'm having a hard time extending this to the batched pairwise case. That is, x has shape (B,n,n) and y has shape (C,n,n). You can assume they get flattened to (B,n^2) and (C, n^2), respectively. The output should have shape (B,C). Here B is batch size and C is some number that will generally be different than B.
So far, all I have figured out is that, with x of shape (B,n^2) and y of shape (C,n^2), I can get a broadcasted outer product as follows:
xt = x[:,None,:,None]
yt = y[None,:,None,:]
outer = xt*yt # has shape (B,C,n^2,n^2)
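One way to avoid materializing the (B,C,n^2,n^2) outer product is to note that torch.sum(torch.outer(x,y) * w) is the bilinear form x^T W y, so all B*C pairs can come from two matrix products. The following is only a sketch under assumptions: it reuses the normalization of the toy example above (dividing by ||x||^2), expects inputs that are already flattened, and the function name batched_morans_i is made up for illustration.

```python
import torch

def batched_morans_i(x, y, w):
    """Pairwise bivariate Moran's I (toy-example normalization).

    x: (B, m) and y: (C, m) flattened inputs, w: (m, m) weights, m = n^2.
    Returns a (B, C) tensor.
    """
    m = x.shape[1]
    # center every row, as in the single-pair example
    x = x - x.mean(dim=1, keepdim=True)
    y = y - y.mean(dim=1, keepdim=True)
    # x_b^T W y_c for every pair (b, c) without building the outer product
    num = x @ w @ y.T                              # (B, C)
    # assumption: same denominator as the toy example, ||x_b||^2
    denom = (x ** 2).sum(dim=1, keepdim=True)      # (B, 1), broadcasts over C
    return num / denom * (m / w.sum())
```

Called with the single pair from the toy example (as (1,4) tensors and an identity weight matrix), this should reproduce the value 1.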

Related

Constrained linear combination of learned parameters in PyTorch?

I have three tensors X, Y, Z and I want to learn the optimal convex combination of these tensors with respect to some cost, i.e.
aX + bY + cZ such that a + b + c = 1. How can I do this easily in PyTorch?
I know that I could just concatenate along an unsqueezed axis and then apply a linear layer like so:
X = X.unsqueeze(-1)
Y = Y.unsqueeze(-1)
Z = Z.unsqueeze(-1)
W = torch.cat([X,Y,Z], dim = -1) # third axis has dimension 3
W = torch.nn.Linear(3,1)(W)
but this would not apply the convex combination constraint...
I found an answer that works well, for those who are interested. It generalizes to a combination of N tensors: you just need to change the dimension of the weights and the number of tensors you concatenate.
weights = nn.Parameter(torch.rand(1,3))
X = X.unsqueeze(-1)
Y = Y.unsqueeze(-1)
Z = Z.unsqueeze(-1)
weights_normalized = nn.functional.softmax(weights, dim=-1)
output = torch.matmul(torch.cat([X, Y, Z], dim=-1), weights_normalized.t()).squeeze()
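For the N-tensor case, the same softmax trick can be wrapped in a small module. This is a hedged sketch rather than anything from the original answer: the class name ConvexCombination and the choice to initialise the raw weights at zero (which starts from a plain average) are assumptions made for illustration.

```python
import torch
from torch import nn

class ConvexCombination(nn.Module):
    """Learns weights that are non-negative and sum to 1 (via softmax)."""

    def __init__(self, num_tensors):
        super().__init__()
        # unconstrained parameters; zeros give equal weights after softmax
        self.logits = nn.Parameter(torch.zeros(num_tensors))

    def forward(self, *tensors):
        weights = torch.softmax(self.logits, dim=0)   # (N,), sums to 1
        stacked = torch.stack(tensors, dim=-1)        # (..., N)
        return stacked @ weights                      # (...), weighted sum
```

Because the constraint lives inside the softmax, any optimizer can update logits freely and the combination stays convex.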

How to take advantage of vectorization when computing the pdf for a multivariate gaussian?

I've been spending a few hours googling about this problem and it seems I can't find any information.
I tried coding a multivariate gaussian pdf as:
import numpy as np
def multivariate_normal(X, M, S):
    # X has shape (D, N) where D is the number of dimensions and N the number of observations
    # M is the mean vector with shape (D, 1)
    # S is the covariance matrix with shape (D, D)
    D = S.shape[0]
    S_inv = np.linalg.inv(S)
    logdet = np.log(np.linalg.det(S))
    log2pi = np.log(2*np.pi)
    devs = X - M
    # log-density per observation; the quadratic form carries a factor of 1/2
    a = np.array([- D/2 * log2pi - (1/2) * logdet - (1/2) * dev.T @ S_inv @ dev for dev in devs.T])
    return np.exp(a)
I've only been successful in computing the pdf through a for loop, iterating N times. If I don't, I end up with an (N, N) matrix, which is unhelpful. I've found another post here, but it is quite outdated and in MATLAB.
Is there any way to take advantage of numpy's vectorisation?
This is my first post on stackoverflow, let me know if anything is off!
I came across this problem in a similar manner and here's how I solved it:
Variables:
X = numpy.ndarray[numpy.ndarray[float]] - m x n
MU = numpy.ndarray[numpy.ndarray[float]] - k x n
SIGMA = numpy.ndarray[numpy.ndarray[numpy.ndarray[float]]] - k x n x n
k = int
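Where X is my feature matrix (m samples, n features), MU holds the k mean vectors, and SIGMA holds the k covariance matrices. In the code below, mu and sigma are the mean vector and covariance matrix of one of the k components.
To vectorize, I rewrote the quadratic form using the definition of the dot product, as an element-wise product followed by a sum over each row: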
sigma_det = np.linalg.det(sigma)
sigma_inv = np.linalg.inv(sigma)
const = 1/((2*np.pi)**(n/2)*sigma_det**(1/2))
p = const*np.exp((-1/2)*np.sum((X-mu).dot(sigma_inv)*(X-mu),axis=1))
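To see why the row-wise sum works, it can be compared against an explicit loop over samples. The sketch below uses made-up values for X, mu and sigma (they are not from the answer) and only checks the quadratic form, which is the part that replaces the loop.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 3                                  # number of features (illustrative)
X = rng.normal(size=(50, n))           # 50 samples, one per row
mu = np.array([0.5, -1.0, 2.0])
sigma = np.array([[2.0, 0.3, 0.0],
                  [0.3, 1.0, 0.2],
                  [0.0, 0.2, 1.5]])
sigma_inv = np.linalg.inv(sigma)
dev = X - mu

# one quadratic form per sample, computed with a Python loop
quad_loop = np.array([d @ sigma_inv @ d for d in dev])
# same values: (dev @ sigma_inv) row i dotted with dev row i
quad_vec = np.sum(dev.dot(sigma_inv) * dev, axis=1)
```

Both arrays have shape (50,); the vectorized version never forms the (50, 50) matrix that a plain dev @ sigma_inv @ dev.T would produce.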
I have been working on this problem for the last few days and finally have come to a solution.
To do so I have added an extra dimension to the x vector, and then used the np.einsum() function for computing the Mahalanobis distance.
Example
For the following example we will use a (100 x 2) input array. That is, 100 samples of two random variables. That gives us a (1 x 2) mean vector and a (2 x 2) covariance matrix.
Generating some data:
# instantiate a random number generator
rng = np.random.default_rng(100)
# define mu and sigma for the dummy sample
mu = np.array([0.5, 0.25])
covmat = np.array([[1, 0.5],
[0.5, 1]])
# generate multivariate normal random sample
x = rng.multivariate_normal(mu, covmat, size=100)
And defining the pdf function:
def pdf(x, mu, covmat):
    """
    Evaluates the probability density of each row of x under the
    distribution N(mu, covmat)
    Returns: the density values, shape (n, 1)
    """
    x = x[:, np.newaxis]  # insert a new axis at position 1: (n, k) -> (n, 1, k)
    k = mu.shape[0]  # number of dimensions
    diff = x - mu  # deviation of x from the mean
    inv_covmat = np.linalg.inv(covmat)
    # normalizing constant: (2*pi)^(-k/2) * det(covmat)^(-1/2)
    term1 = (2*np.pi)**-(k/2) * np.sqrt(np.linalg.det(inv_covmat))
    term2 = np.exp(-np.einsum('ijk, kl, ijl->ij', diff, inv_covmat, diff) / 2)
    return term1 * term2
This returns an (n, 1) array, where n is the number of samples; in this case, (100, 1).
Explanation
The easiest way to think about solving the problem is just writing down the dimensions, and trying to do the linear algebra.
We need to do some kind of manipulation of three tensors with the following shapes, to get the resulting tensor:
A, B, C -> D
(100 x 1 x 2), (2, 2), (100 x 1 x 2) -> (100 x 1)
Let the first tensor, A, have the indices, ijk:
Then we want to do some operation of A and B to get the shape (100 x 1 x 2).
Hence,
ijk, kl - > ijl
(100 x 1 x 2), (2 x 2) -> (100 x 1 x 2)
This leaves us with AB, C
(100 x 1 x 2), (100 x 1 x 2)
We want D to have the shape (100 x 1)
Hence:
ijl, ijl->ij
(100 x 1 x 2), (100 x 1 x 2) -> (100 x 1)
Putting the two operations together, we get:
ijk, kl, ijl->ij
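The two-step reasoning above can be checked directly: contracting in two separate einsum calls should give the same result as the combined subscript string. A small sketch with random data of the same shapes as in the explanation (the data itself is made up):

```python
import numpy as np

rng = np.random.default_rng(0)
diff = rng.normal(size=(100, 1, 2))                       # tensor A (and C)
inv_covmat = np.linalg.inv(np.array([[1.0, 0.5],
                                     [0.5, 1.0]]))        # tensor B

# step 1: ijk, kl -> ijl   (A times B, shape stays (100, 1, 2))
step1 = np.einsum('ijk,kl->ijl', diff, inv_covmat)
# step 2: ijl, ijl -> ij   (multiply with C and sum over l)
step2 = np.einsum('ijl,ijl->ij', step1, diff)
# both steps in a single call
combined = np.einsum('ijk,kl,ijl->ij', diff, inv_covmat, diff)
```

Here step2 and combined both have shape (100, 1) and agree up to floating-point rounding.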

Is it possible to put a 1D ndarray (size N) into 1D ndarray (size N,1)

I'm trying to put the results of a calculation into a big matrix whose last dimension can be 1, 2, or more.
To put my result in the matrix, I do
res[i,j,:,:] = y
If y has shape (N,2) or a larger last dimension, it is fine, but if y has shape (N,) I get an error saying:
ValueError: could not broadcast input array from shape (10241) into shape (10241,1)
Small example:
import numpy as np
N=10
y = np.zeros((N,2))
res = np.zeros((2,2,N,2))
res[0,0,:,:]= y
y = np.zeros((N,1))
res = np.zeros((2,2,N,1))
res[0,0,:,:]= y
y = np.zeros(N)
res = np.zeros((2,2,N,1))
res[0,0,:,:]= y
I'm getting the error for the last example, but both y and res are 1D vectors, right?
Is there a way to make this assignment work whatever the size of the last dimension (1, 2 or more)?
In my code I used a try/except, but there may be a better way:
try:
    self.res[i,j,:,:] = self.ODE_solver(len(self.t))
except ValueError:
    self.res[i, j, :, 0] = self.ODE_solver(len(self.t))
For the generic solution that works across all three scenarios, use -
res[0,0,:,:] = y.reshape(y.shape[0],-1)
So, basically, we are making y 2D while keeping the first axis length intact and changing the second one based on the leftover.
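As a quick illustration of that reshape, here is a sketch that applies it to the three shapes from the question; the helper name to_columns is hypothetical and only wraps the one-liner above.

```python
import numpy as np

def to_columns(y):
    """Return y as a 2D array: first axis kept, remaining axes flattened."""
    return y.reshape(y.shape[0], -1)

N = 10
shapes_out = []
for y in (np.zeros((N, 2)), np.zeros((N, 1)), np.zeros(N)):
    y2d = to_columns(y)
    res = np.zeros((2, 2, N, y2d.shape[1]))
    res[0, 0, :, :] = y2d          # works for all three inputs
    shapes_out.append(y2d.shape)
```

After the loop, shapes_out holds (10, 2), (10, 1) and (10, 1): the 1D input gains a trailing axis of length 1, and the 2D inputs are left unchanged.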
You can reshape y to be the last 2 dimensions of res.
N=10
y = np.zeros((N,2))
res = np.zeros((2,2,N,2))
res[0,0,:,:]= y.reshape(res.shape[-2:])
y = np.zeros((N,1))
res = np.zeros((2,2,N,1))
res[0,0,:,:]= y.reshape(res.shape[-2:])
y = np.zeros(N)
res = np.zeros((2,2,N,1))
res[0,0,:,:]= y.reshape(res.shape[-2:])

dot product with diagonal matrix, without creating it full matrix

I'd like to calculate a dot product of two matrices, where one of them is a diagonal matrix. However, I don't want to use np.diag or np.diagflat in order to create the full matrix, but instead use the 1D array directly filled with the diagonal values. Is there any way or numpy operation which I can use for this kind of problem?
x = np.arange(9).reshape(3,3)
y = np.arange(3) # diagonal elements
z = np.dot(x, np.diag(y))
and the solution I'm looking for should be without np.diag
z = x ??? y
Directly multiplying the ndarray by your vector will work. Numpy conveniently assumes that you want to multiply the nth column of x by the nth element of your y.
x = np.random.random((5, 5))
y = np.random.random(5)
diagonal_y = np.diag(y)
z = np.dot(x, diagonal_y)
np.allclose(z, x * y) # Will return True
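The side on which the diagonal matrix sits decides which axis gets scaled. A short sketch with the arrays from the question, showing both cases through broadcasting (the variable names right and left are just for illustration):

```python
import numpy as np

x = np.arange(9).reshape(3, 3)
y = np.arange(3)  # diagonal elements

# x @ diag(y): column j of x is scaled by y[j]
right = x * y
# diag(y) @ x: row i of x is scaled by y[i], so y needs a trailing axis
left = y[:, None] * x
```

Neither expression allocates the full 3x3 diagonal matrix.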
The Einstein summation is an elegant solution to this kind of problem:
import numpy as np
x = np.random.uniform(0,1, size=5)
w = np.random.uniform(0,1, size=(5, 3))
diagonal_x = np.diagflat(x)
z = np.dot(diagonal_x, w)
zz = np.einsum('i,ij->ij',x , w)
np.allclose(z, zz) # Will return True
See: https://docs.scipy.org/doc/numpy/reference/generated/numpy.einsum.html#numpy.einsum

how to perform coordinates affine transformation using python?

I would like to perform transformation for this example data set.
There are four known points with coordinates x, y, z in one coordinate system [primary_system] and four more known points with coordinates x, y, h that belong to another coordinate system [secondary_system].
Those points correspond; for example, primary_system1 and secondary_system1 are exactly the same point, but we have its coordinates in two different coordinate systems.
So I have four pairs of adjustment points and want to transform the coordinates of another point from the primary system to the secondary system according to that adjustment.
primary_system1 = (3531820.440, 1174966.736, 5162268.086)
primary_system2 = (3531746.800, 1175275.159, 5162241.325)
primary_system3 = (3532510.182, 1174373.785, 5161954.920)
primary_system4 = (3532495.968, 1175507.195, 5161685.049)
secondary_system1 = (6089665.610, 3591595.470, 148.810)
secondary_system2 = (6089633.900, 3591912.090, 143.120)
secondary_system3 = (6089088.170, 3590826.470, 166.350)
secondary_system4 = (6088672.490, 3591914.630, 147.440)
#transform this point
x = 3532412.323
y = 1175511.432
z = 5161677.111
At the moment I try to average the translation along the x, y and z axes using each of the four pairs of points, like this:
#x axis
xt1 = secondary_system1[0] - primary_system1[0]
xt2 = secondary_system2[0] - primary_system2[0]
xt3 = secondary_system3[0] - primary_system3[0]
xt4 = secondary_system4[0] - primary_system4[0]
xt = (xt1+xt2+xt3+xt4)/4 #averaging
...and so on for y and z axis
#y axis
yt1 = secondary_system1[1] - primary_system1[1]
yt2 = secondary_system2[1] - primary_system2[1]
yt3 = secondary_system3[1] - primary_system3[1]
yt4 = secondary_system4[1] - primary_system4[1]
yt = (yt1+yt2+yt3+yt4)/4 #averaging
#z axis
zt1 = secondary_system1[2] - primary_system1[2]
zt2 = secondary_system2[2] - primary_system2[2]
zt3 = secondary_system3[2] - primary_system3[2]
zt4 = secondary_system4[2] - primary_system4[2]
zt = (zt1+zt2+zt3+zt4)/4 #averaging
So above I attempted to calculate average translation vector for every axis
If the mapping is a translation plus a rotation (possibly with scaling or shear as well), it is an affine transformation.
It basically takes the form:
secondary_system = A * primary_system + b
where A is a 3x3 matrix (since you're in 3D), and b is a 3x1 translation.
This can equivalently be written
secondary_system_coords2 = A2 * primary_system2,
where
secondary_system_coords2 is the vector [secondary_system,1],
primary_system2 is the vector [primary_system,1], and
A2 is the 4x4 matrix:
[ A b ]
[ 0,0,0,1 ]
(See the wiki page for more info).
So basically, you want to solve the equation:
y = A2 x
for A2, where y consist of points from secondary_system with 1 stuck on the end, and x is points from primary_system with 1 stuck on the end, and A2 is a 4x4 matrix.
Now if x was a square matrix we could solve it like:
A2 = y*x^(-1)
But x is 4x1. However, you are lucky and have 4 sets of x with 4 corresponding sets of y, so you can construct an x that is 4x4 like so:
x = [ primary_system1 | primary_system2 | primary_system3 | primary_system4 ]
where each of primary_systemi is a 4x1 column vector. Same with y.
Once you have A2, to transform a point from system1 to system 2 you just do:
transformed = A2 * point_to_transform
You can set this up (e.g. in numpy) like this:
import numpy as np
def solve_affine( p1, p2, p3, p4, s1, s2, s3, s4 ):
    x = np.transpose(np.matrix([p1,p2,p3,p4]))
    y = np.transpose(np.matrix([s1,s2,s3,s4]))
    # add ones on the bottom of x and y
    x = np.vstack((x,[1,1,1,1]))
    y = np.vstack((y,[1,1,1,1]))
    # solve for A2
    A2 = y * x.I
    # return function that takes input x and transforms it
    # don't need to return the 4th row as it is
    return lambda x: (A2*np.vstack((np.matrix(x).reshape(3,1),1)))[0:3,:]
Then use it like this:
transformFn = solve_affine( primary_system1, primary_system2,
primary_system3, primary_system4,
secondary_system1, secondary_system2,
secondary_system3, secondary_system4 )
# test: transform primary_system1 and we should get secondary_system1
np.matrix(secondary_system1).T - transformFn( primary_system1 )
# np.linalg.norm of above is 0.02555
# transform another point (x,y,z).
transformed = transformFn((x,y,z))
Note: There is of course numerical error here, and this may not be the best way to solve for the transform (you might be able to do some sort of least squares thing).
Also, the error for converting primary_systemx to secondary_systemx is (for this example) of order 10^(-2).
You'll have to consider whether this is acceptable or not (it does seem large, but it might be acceptable when compared to your input points which are all of order 10^6).
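The least-squares idea mentioned in the note could look roughly like the sketch below. It is an alternative to the matrix inverse above, not the answer's own method: the function name fit_affine_lstsq is hypothetical, and centering both point sets before the solve is a choice made here to keep the ~10^6-sized coordinates from hurting the conditioning. It also accepts more than four point pairs.

```python
import numpy as np

def fit_affine_lstsq(primary, secondary):
    """Fit secondary ~ A @ primary + t in the least-squares sense.

    primary, secondary: (N, 3) arrays of corresponding points, N >= 4.
    Returns a function mapping (3,) or (M, 3) points to the secondary system.
    """
    primary = np.asarray(primary, dtype=float)
    secondary = np.asarray(secondary, dtype=float)
    p_mean = primary.mean(axis=0)
    s_mean = secondary.mean(axis=0)
    # solve (primary - p_mean) @ M = (secondary - s_mean); M is A transposed
    M, *_ = np.linalg.lstsq(primary - p_mean, secondary - s_mean, rcond=None)

    def transform(points):
        points = np.asarray(points, dtype=float)
        return (points - p_mean) @ M + s_mean

    return transform

primary = np.array([[3531820.440, 1174966.736, 5162268.086],
                    [3531746.800, 1175275.159, 5162241.325],
                    [3532510.182, 1174373.785, 5161954.920],
                    [3532495.968, 1175507.195, 5161685.049]])
secondary = np.array([[6089665.610, 3591595.470, 148.810],
                      [6089633.900, 3591912.090, 143.120],
                      [6089088.170, 3590826.470, 166.350],
                      [6088672.490, 3591914.630, 147.440]])

transform = fit_affine_lstsq(primary, secondary)
# largest absolute error when mapping the known points back
residual = np.abs(transform(primary) - secondary).max()
new_point = transform([3532412.323, 1175511.432, 5161677.111])
```

With exactly four non-coplanar pairs the fit passes through all of them, so residual only reflects rounding; with more pairs it reports how well a single affine map explains the data.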
The mapping you are looking for seems to be an affine transformation. Four 3D points not lying in one plane is exactly the number of points needed to recover the affine transformation. The latter is, loosely speaking, multiplication by a matrix followed by adding a vector:
secondary_system = A * primary_system + t
The problem is now reduced to finding the appropriate matrix A and vector t. I think this code may help you (sorry for the bad code style -- I'm a mathematician, not a programmer):
import numpy as np
# input data
ins = np.array([[3531820.440, 1174966.736, 5162268.086],
[3531746.800, 1175275.159, 5162241.325],
[3532510.182, 1174373.785, 5161954.920],
[3532495.968, 1175507.195, 5161685.049]]) # <- primary system
out = np.array([[6089665.610, 3591595.470, 148.810],
[6089633.900, 3591912.090, 143.120],
[6089088.170, 3590826.470, 166.350],
[6088672.490, 3591914.630, 147.440]]) # <- secondary system
p = np.array([3532412.323, 1175511.432, 5161677.111]) #<- transform this point
# finding transformation
l = len(ins)
entry = lambda r,d: np.linalg.det(np.delete(np.vstack([r, ins.T, np.ones(l)]), d, axis=0))
M = np.array([[(-1)**i * entry(R, i) for R in out.T] for i in range(l+1)])
A, t = np.hsplit(M[1:].T/(-M[0])[:,None], [l-1])
t = np.transpose(t)[0]
# output transformation
print("Affine transformation matrix:\n", A)
print("Affine transformation translation vector:\n", t)
# unittests
print("TESTING:")
# loop variables are named src/dst so that p (the point to transform) is not overwritten
for src, dst in zip(ins, out):
    image = np.dot(A, src) + t
    result = "[OK]" if np.allclose(image, dst) else "[ERROR]"
    print(src, " mapped to: ", image, " ; expected: ", dst, result)
# calculate points
print("CALCULATION:")
P = np.dot(A, p) + t
print(p, " mapped to: ", P)
This code demonstrates how to recover the affine transformation as a matrix plus a vector, and tests that the initial points are mapped to where they should be. You can test this code with Google Colab, so you don't have to install anything.
Regarding theory behind this code: it is based on equation presented in "Beginner's guide to mapping simplexes affinely", matrix recovery is described in section "Recovery of canonical notation" and number of points needed to pinpoint the exact affine transformation is discussed in "How many points do we need?" section. The same authors published "Workbook on mapping simplexes affinely" that contains many practical examples of this kind.
