Suppose I have an n x 1 column vector v, and an n x m matrix M. I'm looking for a method to subtract v from every column of M without a loop in Numpy. How can I do this?
I've searched the web and I can't find a method to do this.
Besides searching the web, most of the time it is useful to just play around with the arrays and see what works. In your case it is really straightforward:
import numpy as np
n, m = 13, 17
v = np.random.random((n, 1))
M = np.random.random((n, m))
res = M - v
The NumPy documentation on broadcasting is also a good resource for getting familiar with the basic concepts.
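To see why this works, it may help to compare the broadcast result against an explicit loop (a quick sanity check, reusing the arrays above):

# Broadcasting stretches the (n, 1) vector across all m columns of M
res_loop = np.empty_like(M)
for col in range(m):
    res_loop[:, col] = M[:, col] - v[:, 0]
assert np.allclose(res, res_loop)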
The usual ways to map a function over a numpy.ndarray, such as np.array(list(map(some_func, x))) or np.vectorize(f)(x), don't provide access to the element's index.
The following code is just a simple example that is commonly seen in many applications.
dis_mat = np.zeros([feature_mat.shape[0], feature_mat.shape[0]])
for i in range(feature_mat.shape[0]):
    for j in range(i, feature_mat.shape[0]):
        dis_mat[i, j] = np.linalg.norm(
            feature_mat[i, :] - feature_mat[j, :]
        )
        dis_mat[j, i] = dis_mat[i, j]
Is there a way to speed it up?
Thank you for your help! The quickest way to speed up this code is to use the function that user2357112 pointed out in a comment:
from scipy.spatial.distance import pdist, squareform
dis_mat = squareform(pdist(feature_mat))
@Julien's method is also good if feature_mat is small, but when feature_mat is 1000 by 2000 it needs nearly 40 GB of memory.
SciPy comes with a function specifically to compute the kind of pairwise distances you're computing. It's scipy.spatial.distance.pdist, and it produces the distances in a condensed format that basically only stores the upper triangle of the distance matrix, but you can convert the result to square form with scipy.spatial.distance.squareform:
from scipy.spatial.distance import pdist, squareform
distance_matrix = squareform(pdist(feature_mat))
This has the benefit of avoiding the giant intermediate arrays required by a direct vectorized solution, so it's faster and works on larger inputs. It does lose on timing, though, to an approach that uses algebraic manipulation to let dot handle the heavy lifting.
pdist also supports a wide variety of alternate distance metrics, if you decide you want something other than Euclidean distance.
# Manhattan distance!
distance_matrix = squareform(pdist(feature_mat, 'cityblock'))
# Cosine distance!
distance_matrix = squareform(pdist(feature_mat, 'cosine'))
# Correlation distance!
distance_matrix = squareform(pdist(feature_mat, 'correlation'))
# And more! Check out the docs.
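As a quick sanity check on the condensed format (a sketch, assuming feature_mat has n rows): pdist returns a flat array of the n*(n-1)/2 upper-triangle distances, and squareform expands it to the full symmetric matrix:

import numpy as np
from scipy.spatial.distance import pdist, squareform

feature_mat = np.random.rand(100, 200)
condensed = pdist(feature_mat)               # shape (100*99/2,) == (4950,)
assert condensed.shape == (100 * 99 // 2,)
distance_matrix = squareform(condensed)      # shape (100, 100)
assert np.allclose(distance_matrix, distance_matrix.T)  # symmetric
assert np.all(np.diag(distance_matrix) == 0)            # zero diagonal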
You can create a new axis and broadcast:
dis_mat = np.linalg.norm(feature_mat[:,None] - feature_mat, axis=-1)
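To see the shapes involved (a quick check; the intermediate difference array is what makes this memory-hungry for large inputs):

feature_mat = np.random.rand(100, 200)
diff = feature_mat[:, None] - feature_mat   # (100, 1, 200) - (100, 200) -> (100, 100, 200)
assert diff.shape == (100, 100, 200)
dis_mat = np.linalg.norm(diff, axis=-1)     # reduce the last axis -> (100, 100)
assert dis_mat.shape == (100, 100)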
Timing:
feature_mat = np.random.rand(100,200)
def a():
    dis_mat = np.zeros([feature_mat.shape[0], feature_mat.shape[0]])
    for i in range(feature_mat.shape[0]):
        for j in range(i, feature_mat.shape[0]):
            dis_mat[i, j] = np.linalg.norm(
                feature_mat[i, :] - feature_mat[j, :]
            )
            dis_mat[j, i] = dis_mat[i, j]

def b():
    dis_mat = np.linalg.norm(feature_mat[:, None] - feature_mat, axis=-1)
%timeit a()
100 loops, best of 3: 20.5 ms per loop
%timeit b()
100 loops, best of 3: 11.8 ms per loop
Factor out what can be computed algebraically (using the identity ||x - y||^2 = ||x||^2 + ||y||^2 - 2 x.y), and use np.dot's optimizations on a k x k matrix, in little memory space (k x k):
def c(m):
    xy = np.dot(m, m.T)                 # O(k^3): all pairwise dot products
    x2 = y2 = (m * m).sum(1)            # O(k^2): squared norms of the rows
    d2 = np.add.outer(x2, y2) - 2 * xy  # O(k^2): ||x||^2 + ||y||^2 - 2 x.y
    d2.flat[::len(m) + 1] = 0           # zero the diagonal (rounding issues)
    return np.sqrt(d2)                  # O(k^2)
And for comparison:
def d(m):
    return squareform(pdist(m))
Here are the timeit results for k x k initial matrices:
Both algorithms are O(k^3), but c(m) does the O(k^3) part of the work through np.dot, the critical kernel of linear algebra, which benefits from all the optimizations (multicore and so on). pdist is just loops, as can be seen in the source. This explains the 15x factor for big arrays, even though pdist exploits the symmetry of the matrix by calculating only half of the terms.
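As a quick check that the factorization matches pdist (a sketch; clamping tiny negative values before the square root guards against the rounding issues mentioned above):

import numpy as np
from scipy.spatial.distance import pdist, squareform

m = np.random.rand(50, 30)
xy = np.dot(m, m.T)
x2 = (m * m).sum(1)
d2 = np.add.outer(x2, x2) - 2 * xy
d2.flat[::len(m) + 1] = 0
d_fast = np.sqrt(np.maximum(d2, 0))   # clamp rounding noise below zero
assert np.allclose(d_fast, squareform(pdist(m)))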
One way I thought of to avoid mixing NumPy and for loops would be to create an index array using a version of this index creator that allows for replacement:
import numpy as np
from itertools import chain, combinations_with_replacement
from scipy.special import comb

def comb_index(n, k):
    count = comb(n, k, exact=True, repetition=True)
    index = np.fromiter(chain.from_iterable(
                            combinations_with_replacement(range(n), k)),
                        int, count=count * k)
    return index.reshape(-1, k)
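For example, with n = 3 and k = 2 this yields all n*(n+1)/2 index pairs with replacement:

comb_index(3, 2)
# array([[0, 0],
#        [0, 1],
#        [0, 2],
#        [1, 1],
#        [1, 2],
#        [2, 2]])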
Then we simply take each of those index pairs, compute the difference between the corresponding rows, reshape the resulting array, and take the norm of each of its rows:
reshape_mat = np.diff(feature_mat[comb_index(feature_mat.shape[0], 2), :], axis=1).reshape(-1, feature_mat.shape[1])
dis_list = np.linalg.norm(reshape_mat, axis=-1)
Note that dis_list is simply a flat array of all n*(n+1)/2 possible norms. This runs at close to the same speed as the other answer for the feature_mat provided, and when comparing the byte sizes of our largest intermediate arrays,
(feature_mat[:,None] - feature_mat).nbytes == 16000000
while
np.diff(feature_mat[comb_index(feature_mat.shape[0], 2), :], axis=1).reshape(-1, feature_mat.shape[1]).nbytes == 8080000
For most inputs, mine uses only half the storage: still suboptimal, but a marginal improvement.
Based on np.triu_indices, in case you really want to do this with pure NumPy:
s = feature_mat.shape[0]
i, j = np.triu_indices(s, 1) # All possible combinations of indices
dist_mat = np.empty((s, s)) # Don't waste time filling with zeros
np.einsum('ii->i', dist_mat)[:] = 0 # When you can just fill the diagonal
dist_mat[i, j] = dist_mat[j, i] = np.linalg.norm(feature_mat[i] - feature_mat[j], axis=-1)
# Vectorized version of your original process
The benefit of this method over broadcasting is that you can do it in chunks:
n = 10000000  # chunk size; based on your available RAM
for k in range(0, i.size, n):
    i_ = i[k: k + n]
    j_ = j[k: k + n]
    dist_mat[i_, j_] = dist_mat[j_, i_] = \
        np.linalg.norm(feature_mat[i_] - feature_mat[j_], axis=-1)
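Each chunk only materializes a difference array of shape (n, feature_mat.shape[1]) rather than the full (s, s, feature_mat.shape[1]) broadcast, so peak memory stays bounded by the chunk size n.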
Let's begin by rewriting this in terms of a function:
def dist(mat, i, j):
    return np.linalg.norm(mat[i, :] - mat[j, :])

size = feature_mat.shape[0]
dis_mat = np.zeros((size, size))
for i in range(size):
    for j in range(size):
        dis_mat[i, j] = dist(feature_mat, i, j)
This can be rewritten in (a slightly more) vectorized form as:
v = [dist(feature_mat, i, j) for i in range(size) for j in range(size)]
dist_mat = np.array(v).reshape(size, size)
Notice that we're still relying on Python rather than NumPy for some of the computation, but it's a step towards vectorization. Also notice that dist(i, j) is symmetric, so we could further reduce the computation by approximately half. Perhaps consider:
v = [dist(feature_mat, i, j) for i in range(size) for j in range(i + 1)]
Now the tricky bit is assigning these computed values to the correct elements in dist_mat.
How fast this performs depends on the cost of computing dist(i, j). For small feature_mats, the cost of recomputing is not high enough to worry about this. But for large matrices, you definitely do not want to recompute.
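One way to handle that assignment (a sketch, assuming dist and feature_mat as defined above) is to let np.tril_indices generate the index pairs in the same row-major order as the list comprehension:

size = feature_mat.shape[0]
v = [dist(feature_mat, i, j) for i in range(size) for j in range(i + 1)]
dist_mat = np.zeros((size, size))
rows, cols = np.tril_indices(size)   # (0,0), (1,0), (1,1), (2,0), ... matches the loop order
dist_mat[rows, cols] = v
dist_mat[cols, rows] = v             # mirror into the upper triangle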
I want to do a series of dot products. Namely
for i in range(N[0]):
    for j in range(N[1]):
        kr[i, j] = dot(k[i, j, :], r[i, j, :])
Is there a vectorized way to do this, for example using einsum or tensordot?
Assuming N[0] and N[1] are the lengths of the first two dimensions of k and r,
kr = numpy.einsum('...i,...i->...', k, r)
We specify ... to enable broadcasting, and perform a dot product along the last axis.
Assuming k and r have three dimensions, this is the same as:
kr = numpy.sum(k * r, axis=-1)
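A quick check against the original loop (assuming k and r are 3-D, with small sizes for illustration):

import numpy as np

N = (4, 5)
k = np.random.rand(N[0], N[1], 3)
r = np.random.rand(N[0], N[1], 3)

kr_loop = np.empty(N)
for i in range(N[0]):
    for j in range(N[1]):
        kr_loop[i, j] = np.dot(k[i, j, :], r[i, j, :])

assert np.allclose(np.einsum('...i,...i->...', k, r), kr_loop)
assert np.allclose(np.sum(k * r, axis=-1), kr_loop)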
I have two matrices A and B, both of dimensions N x K x D, and I want to get a matrix C of dimensions N x K x D x D, where C[n, k] = A[n, k] x B[n, k].T (here "x" means the product of matrices of dimensions D x 1 and 1 x D, so the result must be D x D). My code currently looks like this (here A = B = X):
def square(X):
    out = np.zeros((N, K, D, D))
    for n in range(N):
        for k in range(K):
            out[n, k] = np.dot(X[n, k, :, np.newaxis], X[n, k, np.newaxis, :])
    return out
It may be slow for big N and K because of Python's for loops. Is there some way to do this multiplication in one NumPy operation?
It seems you are not using np.dot for sum-reduction, but just for an expansion that results in broadcasting. So you can simply extend the array to have one more dimension with np.newaxis/None and let implicit broadcasting help out.
Thus, an implementation would be -
X[...,None]*X[...,None,:]
More info on broadcasting, specifically on how to add new axes, can be found in this other post.
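A quick shape check of the broadcasting (a sketch with small N, K, D):

import numpy as np

N, K, D = 4, 5, 3
X = np.random.rand(N, K, D)
out = X[..., None] * X[..., None, :]   # (N, K, D, 1) * (N, K, 1, D) -> (N, K, D, D)
assert out.shape == (N, K, D, D)
assert np.allclose(out[2, 3], np.outer(X[2, 3], X[2, 3]))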
This is probably obvious on reflection, but it's not clear to me right now.
For a pair of numpy arrays of shapes (K, N, M) and (K, M, N) denoted by a and b respectively, is there a way to compute the following as a single vectorized operation:
import numpy as np
K = 5
N = 2
M = 3
a = np.random.randn(K, N, M)
b = np.random.randn(K, M, N)
output = np.empty((K, N, N))
for each_a, each_b, each_out in zip(a, b, output):
    each_out[:] = each_a.dot(each_b)
A simple a.dot(b) returns the dot product for every pair along the first axis (so it returns an array of shape (K, N, K, N)).
edit: fleshed out the code a bit for those that couldn't understand the question.
I answered a similar question a while back: Element-wise matrix multiplication in NumPy.
I think what you're looking for is:
output = np.einsum('ijk,ikl->ijl', a, b)
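For 3-D stacks like these, this einsum is exactly what np.matmul computes, so output = a @ b (or np.matmul(a, b)) gives the same result on any reasonably recent NumPy. A quick check, assuming a and b from the question:

assert np.allclose(np.einsum('ijk,ikl->ijl', a, b), a @ b)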
Good luck!