I have one matrix and one vector, of dimensions (N, d) and (N,) respectively. For each row, I want to divide each element by the corresponding value in the vector. I was wondering if there was a vectorized implementation (to save computation time). (I'm trying to create points on the surface of a d-dimensional sphere.) Right now I'm doing this:
x = np.random.randn(N,d)
norm = np.linalg.norm(x, axis=1)
for i in range(N):
    for j in range(d):
        x[i][j] = x[i][j] / norm[i]
np.linalg.norm has a keepdims argument just for this:
x /= np.linalg.norm(x, axis=1, keepdims=True)
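For example, a quick self-contained check (a sketch with made-up N and d) that every row then lies on the unit sphere:

import numpy as np

N, d = 1000, 5
x = np.random.randn(N, d)
x /= np.linalg.norm(x, axis=1, keepdims=True)  # (N, 1) norms broadcast across columns
print(np.allclose(np.linalg.norm(x, axis=1), 1.0))  # True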
Given an n-by-n matrix A, where each row of A is a permutation of [n], e.g.,
import torch
n = 100
AA = torch.rand(n, n)
A = torch.argsort(AA, dim=1)
Also given another n-by-n matrix P, we want to construct a 3D tensor Q s.t.
Q[i, j, k] = P[A[i, j], k]
Is there an efficient way to do this in PyTorch?
I am aware of torch.gather, but it seems hard to apply directly here.
You can directly use:
Q = P[A]
Why not simply use A as an index:
Q = P[A, :]
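For what it's worth, a small sanity check (a sketch; the loops are only for verification) that this indexing satisfies Q[i, j, k] = P[A[i, j], k]:

import torch

n = 4
P = torch.rand(n, n)
A = torch.argsort(torch.rand(n, n), dim=1)
Q = P[A]  # advanced indexing along dim 0; result has shape (n, n, n)

for i in range(n):
    for j in range(n):
        assert torch.equal(Q[i, j], P[A[i, j]])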
I'm currently trying to fill a matrix K where each entry in the matrix is just a function applied to two entries of an array x.
At the moment I'm using the most obvious method of running through rows and columns one at a time using a double for-loop:
K = np.zeros((x.shape[0], x.shape[0]), dtype=np.float32)
for i in range(x.shape[0]):
    for j in range(x.shape[0]):
        K[i, j] = f(x[i], x[j])
While this works fine, the resulting matrix is 10,000 by 10,000 and takes very long to calculate. I was wondering if there is a more efficient way to do this built into NumPy?
EDIT: The function in question here is a gaussian kernel:
def gaussian(a, b, sigma):
    vec = a - b
    return np.exp(-np.dot(vec, vec) / (2 * sigma**2))
where I set sigma in advance before calculating the matrix.
The array x is an array of shape (10000, 8). So the scalar product in the gaussian is between two vectors of dimension 8.
You can use a single for loop together with broadcasting. This requires changing the implementation of the gaussian function to accept 2D inputs:
def gaussian(a, b, sigma):
    vec = a - b
    return np.exp(-np.sum(vec**2, axis=-1) / (2 * sigma**2))

K = np.zeros((x.shape[0], x.shape[0]), dtype=np.float32)
for i in range(x.shape[0]):
    K[i] = gaussian(x[i:i+1], x, sigma)
Theoretically you could accomplish this even without any for loop, again by using broadcasting, but an intermediate array of size len(x)**2 * x.shape[1] will be created, which might run out of memory for your array sizes:
K = gaussian(x[None, :, :], x[:, None, :], sigma)
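If SciPy is available, an alternative sketch that avoids the (n, n, d) intermediate is to let cdist produce the squared distances directly:

from scipy.spatial.distance import cdist

sq_dists = cdist(x, x, 'sqeuclidean')   # (n, n) squared Euclidean distances
K = np.exp(-sq_dists / (2 * sigma**2))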
I'm implementing the inverse power method to find the maximum eigenvalue of a matrix. Given an $n \times n$ matrix $A$ and a vector $x$ (a np.array with shape (len(A),)), one of the steps of the implementation involves computing this value:
$q = x^T A x$
The thing is, I don't know if I'm implementing this the right way:
q = x.transpose() @ A @ x
Is there a better way to compute this?
x = x.reshape(-1, 1)  # or x = x.reshape(1, -1) if x is a row vector
A_times_x = np.matmul(A, x)       # A x
q = np.matmul(x.T, A_times_x)     # x^T (A x)
Instead of np.matmul you should be able to use np.dot as well, but the former is more readable in my opinion.
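Note that if x stays 1-D (shape (len(A),)), no reshaping is needed at all; a minimal sketch:

# with 1-D x, x @ A yields a 1-D array, and the further @ x contracts it to a scalar
q = x @ A @ x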
I have a matrix of counts,
import numpy as np
x = np.array([[1, 2, 3], [1, 4, 6], [2, 3, 7]])
And I need the percentages of the total along axis = 1:
for i in range(x.shape[0]):
    for j in range(x.shape[1]):
        x[i, j] = x[i, j] / np.sum(x[i, :])
In numpy broadcast form.
Currently, I have:
x_sums = np.sum(x, axis=1)
for j in range(x.shape[1]):
    x[:, j] = x[:, j] / x_sums[:]
Which puts most of the complexity in numpy code...but a numpy one liner would be best.
Also,
def percentages(a):
    return a / np.sum(a)

x_percentages = np.apply_along_axis(percentages, 1, x)
But that still involves python.
np.linalg.norm is very close, in terms of what is going on, but it only has the 8 hardcoded norms, which do not include percentage of total.
Then there is np.percentile, which is again close... but it computes the sorted percentile, which is not what I want.
x /= x.sum(axis=1, keepdims=True)
Although x should have a floating-point dtype for this to work correctly.
Better may be:
x = np.true_divide(x, x.sum(axis=1, keepdims=True))
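For example, with the array above:

x = np.array([[1, 2, 3], [1, 4, 6], [2, 3, 7]])
print(np.true_divide(x, x.sum(axis=1, keepdims=True)))
# [[0.16666667 0.33333333 0.5       ]
#  [0.09090909 0.36363636 0.54545455]
#  [0.16666667 0.25       0.58333333]]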
Could this be what you are after?
print((x.T / np.sum(x, axis=1)).T)
The first transpose lines up the row sums with the broadcasting axis, and the trailing .T restores the original orientation.
Here is my problem:
I have two sets of 3D points. Let's call them "Gausspoints" and "XYZ". I define a function which is a sum of Gaussians, in which every Gaussian is centered at one of the Gausspoints. Now I want to evaluate this function on the XYZ points. My approach works fine but is rather slow. Any idea how to speed it up by exploiting numpy a little better?
def sumgaus(r):
    t = r - Gausspoints
    t = np.array(list(map(np.linalg.norm, t)))  # list() so np.power gets an array, not a map object
    t = -np.power(t, 2.0)
    t = np.exp(t)
    res = np.sum(t)
    return res

result = list(map(sumgaus, XYZ))
Thanks for any help
Edit: XYZ has shape N*3 and Gausspoints has shape M*3, with M and N being different integers.
Edit 2: I want to apply the above function to each item in XYZ.
The tricky part is how to vectorize the computation of all the differences between your points without any explicit Python looping or mapping. You can roll your own implementation using broadcasting by doing something like:
dist2 = XYZ[:, np.newaxis, :] - Gausspoints  # shape (n, m, 3) pairwise differences
dist2 *= dist2                               # square elementwise
dist2 = np.sum(dist2, axis=-1)               # shape (n, m)
If XYZ has shape (n, 3) and Gausspoints has shape (m, 3), then dist2 will have shape (n, m), with dist2[i, j] being the squared distance between points XYZ[i] and Gausspoints[j].
It may be easier to understand using scipy.spatial.distance.cdist:
from scipy.spatial.distance import cdist
dist2 = cdist(XYZ, Gausspoints)
dist2 *= dist2
But once you have your array of squared distances, it's child's play:
f = np.sum(np.exp(-dist2), axis=1)
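Putting it all together as a self-contained sketch (random data and sizes just for illustration):

import numpy as np
from scipy.spatial.distance import cdist

rng = np.random.default_rng(0)
XYZ = rng.standard_normal((1000, 3))          # N evaluation points
Gausspoints = rng.standard_normal((200, 3))   # M Gaussian centers

dist2 = cdist(XYZ, Gausspoints, 'sqeuclidean')  # (N, M) squared distances
f = np.sum(np.exp(-dist2), axis=1)              # sum of Gaussians at each XYZ point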