Creating an array from two different arrays with dot product - python

Given an array X of shape (n, d), an array theta of shape (k, d), and a parameter temp_parameter.
In my code, I want to create a matrix whose i,j component is the dot product of the ith row of X and the jth row of theta, divided by temp_parameter. This is what I have so far:
k = theta.shape[0]
n = X.shape[0]
zeros = np.zeros((k, n))
for i in range(n):
    for j in range(k):
        zeros[j][i] = np.dot(X[i], theta[j]) / temp_parameter
However, this is extremely inefficient. Can anybody give me a hint how to improve my code?
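One possible way to drop both loops, offered as a minimal sketch: the double loop above fills a (k, n) array with row-by-row dot products, which is the same as a single matrix product of theta with the transpose of X. The function name scaled_dot_matrix and the random test data are made up for illustration.

```python
import numpy as np

def scaled_dot_matrix(X, theta, temp_parameter):
    """Return a (k, n) array whose [j, i] entry is dot(X[i], theta[j]) / temp_parameter."""
    # theta is (k, d) and X.T is (d, n), so the product is (k, n),
    # matching the layout of the looped version.
    return (theta @ X.T) / temp_parameter

# Illustrative data only (shapes chosen arbitrarily).
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 3))       # n = 4, d = 3
theta = rng.normal(size=(2, 3))   # k = 2, d = 3
result = scaled_dot_matrix(X, theta, temp_parameter=0.5)
print(result.shape)
```

If the (n, k) layout described in the text is wanted instead, X @ theta.T / temp_parameter gives the transpose of this result.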

Related

Compute cosine similarity against every element of a dask matrix

My goal is to find the Top N vectors in a large 3D dask array (~100k rows per side or more would be nice) that are most cosine similar to a target vector. So far I can only get the Top 1, and only for small values of n; n=500 takes over 2 hours. I'm doing something incorrectly with dask, but I'm not sure what. Also, is there a vectorized way to get the cosine similarity instead of the for-loop? In pure numpy I can get to n = ~6000 before I hit a MemoryError. A dtype of float16 gives enough accuracy and is an attempt to save space. If dask isn't the right tool, I'd be open to something else too.
import dask.array as da
import numpy as np
from numpy.linalg import norm
# create a 2d matrix of n rows, each of length n, ideally n is quite large, >100,000
start = 1
step = 1
n = 5
vec_len = 10
shape = [n, vec_len]
end = np.prod(shape) * step + start
arr_2D = da.from_array(np.array(np.arange(start, end, step).reshape(shape), dtype=np.float16))
print(arr_2D.compute())
# sum each row with each other row using broadcasting, resulting in a 3D matrix
# each (i,j) location contains a vector that is the sum of the i-th and j-th original vectors
sums_3D = arr_2D[:, None] + arr_2D[None,:]
# make a target vector
target = np.array(range(vec_len,0,-1))
print('target:', target)
# brute force way to get cosine of each vector in 3D matrix with target vector
da_cos = da.empty(shape=(n,n), dtype=np.float16)
for i in range(n): # <----- is there a way to vectorize this for loop??
    print('row:', i)
    for j in range(i+1, n): # i+1: to get only upper triangle
        cur = sums_3D[i, j]
        cosine = np.dot(target,cur)/(norm(target)*norm(cur))
        da_cos[i,j] = cosine
print(da_cos.compute(), da_cos.dtype, da_cos.shape)
# Get top match <------ how would I get the Top N matches??
ar_max = da_cos.argmax().compute()
best_1, best_2 = np.unravel_index(ar_max, (n,n))
print(da_cos.max().compute(), best_1, best_2)
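A possible vectorized approach in plain NumPy, sketched under some assumptions: because each 3D entry is the sum of two rows, both its dot product with the target and its norm can be derived from the 2D array alone, so the (n, n, vec_len) array never has to be built. The helper names pairwise_sum_cosine and top_n_pairs are hypothetical, float64 is used instead of float16 for simplicity, and no pair is assumed to sum to the zero vector. Only the small example has been checked, not the 100k-row case; the (n, n) result is itself large at that scale.

```python
import numpy as np

def pairwise_sum_cosine(arr, target):
    """Cosine between target and (arr[i] + arr[j]) for every pair (i, j)."""
    arr = np.asarray(arr, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    proj = arr @ target                       # dot(arr[i], target), shape (n,)
    dots = proj[:, None] + proj[None, :]      # dot(arr[i] + arr[j], target)
    gram = arr @ arr.T                        # dot(arr[i], arr[j]), shape (n, n)
    sq = np.diag(gram)                        # squared norm of each row
    # ||a + b||^2 = ||a||^2 + ||b||^2 + 2 * dot(a, b)
    sum_norms = np.sqrt(sq[:, None] + sq[None, :] + 2.0 * gram)
    return dots / (sum_norms * np.linalg.norm(target))

def top_n_pairs(cos, top_n):
    """Return the top_n (i, j, score) tuples from the strict upper triangle."""
    rows, cols = np.triu_indices(cos.shape[0], k=1)
    scores = cos[rows, cols]
    top_n = min(top_n, scores.size)
    # argpartition finds the top_n best without sorting everything
    best = np.argpartition(-scores, top_n - 1)[:top_n]
    best = best[np.argsort(-scores[best])]    # order the winners, best first
    return [(int(rows[b]), int(cols[b]), float(scores[b])) for b in best]

# Same small example as in the question: n = 5, vec_len = 10.
arr_2D = np.arange(1, 51, dtype=np.float64).reshape(5, 10)
target = np.arange(10, 0, -1)
cos = pairwise_sum_cosine(arr_2D, target)
top3 = top_n_pairs(cos, 3)
print(top3)
```

The argpartition step is what answers the "Top N" part; the full sort is only applied to the N selected entries.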

Efficient way to fill NumPy array for independent entries?

I'm currently trying to fill a matrix K where each entry in the matrix is just a function applied to two entries of an array x.
At the moment I'm using the most obvious method of running through rows and columns one at a time using a double for-loop:
K = np.zeros((x.shape[0],x.shape[0]), dtype=np.float32)
for i in range(x.shape[0]):
    for j in range(x.shape[0]):
        K[i,j] = f(x[i],x[j])
While this works fine the resulting matrix is a 10,000 by 10,000 matrix and takes very long to calculate. I was wondering if there is a more efficient way to do this built into NumPy?
EDIT: The function in question here is a gaussian kernel:
def gaussian(a,b,sigma):
    vec = a-b
    return np.exp(- np.dot(vec,vec)/(2*sigma**2))
where I set sigma in advance before calculating the matrix.
The array x is an array of shape (10000, 8). So the scalar product in the gaussian is between two vectors of dimension 8.
You can use a single for loop together with broadcasting. This requires changing the implementation of the gaussian function to accept 2D inputs:
def gaussian(a, b, sigma):
    vec = a - b
    return np.exp(-np.sum(vec**2, axis=-1) / (2 * sigma**2))
K = np.zeros((x.shape[0], x.shape[0]), dtype=np.float32)
for i in range(x.shape[0]):
    K[i] = gaussian(x[i:i+1], x, sigma)
Theoretically you could accomplish this even without any for loop, again by using broadcasting, but here an intermediary array of size len(x)**2 * x.shape[1] will be created which might run out of memory for your array sizes:
K = gaussian(x[None, :, :], x[:, None, :], sigma)
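A different technique from the answer above, shown only as a sketch: the squared distance can be expanded as ||a||^2 + ||b||^2 - 2 a.b, which needs just one matrix product and (n, n) arrays, so the large 3D intermediate is avoided. The function name gaussian_kernel_matrix and the test data are assumptions for illustration; rounding can make this slightly less accurate than the direct difference for nearly identical points.

```python
import numpy as np

def gaussian_kernel_matrix(x, sigma):
    """Gaussian kernel K[i, j] = exp(-||x[i] - x[j]||^2 / (2 sigma^2))."""
    sq_norms = np.sum(x**2, axis=1)                   # ||x[i]||^2, shape (n,)
    # ||a - b||^2 = ||a||^2 + ||b||^2 - 2 * dot(a, b)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (x @ x.T)
    # Rounding can leave tiny negative values; clip them to zero.
    np.maximum(sq_dists, 0.0, out=sq_dists)
    return np.exp(-sq_dists / (2.0 * sigma**2))

# Illustrative data only; the question uses shape (10000, 8).
rng = np.random.default_rng(0)
x = rng.normal(size=(6, 8))
K = gaussian_kernel_matrix(x, sigma=1.5)
print(K.shape)
```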

Why does this array indexing in numpy work?

I've got some numpy 2d arrays:
x, of shape(N,T)
W, of shape(V,D)
they are described as the following:
"Minibatches of size N where each sequence has length T. We assume a vocabulary of V words, assigning each to a vector of dimension D."(This is a question from cs231 A3.)
I want an output array of shape (N, T, D), where I can match each of the elements of x to the desired vector.
First I came up with a solution that uses a loop to run through the rows of x:
for n in range(N):
    out[n, :, :] = W[x[n, :]]
Then I go on to experiment with the second solution:
out = W[x]
Both solutions gave me the right answer, but why does the second solution work? Why can I index a 2D array with another 2D array and get a 3D array back?
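A small check of what happens here, as a sketch with made-up sizes: indexing W with an integer array selects whole rows of W, and the result takes the shape of the index array followed by the remaining axes of W, so out[n, t] is simply W[x[n, t]].

```python
import numpy as np

# Sizes are arbitrary and chosen only for illustration.
N, T, V, D = 2, 3, 5, 4
rng = np.random.default_rng(0)
x = rng.integers(0, V, size=(N, T))   # word indices, shape (N, T)
W = rng.normal(size=(V, D))           # one D-dimensional vector per word

# Integer-array indexing: result shape is x.shape + W.shape[1:] = (N, T, D)
out_fancy = W[x]

# The explicit loop from the question, for comparison.
out_loop = np.empty((N, T, D))
for n in range(N):
    out_loop[n, :, :] = W[x[n, :]]

print(out_fancy.shape)                       # (2, 3, 4)
print(np.array_equal(out_fancy, out_loop))   # True
```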

Handling matrix multiplication in log space in Python

I am implementing a Hidden Markov Model and thus am dealing with very small probabilities. I am handling the underflow by representing variables in log space (so x → log(x)) which has the side effect that multiplication is now replaced by addition and addition is handled via numpy.logaddexp or similar.
Is there an easy way to handle matrix multiplication in log space?
This is the best way I could come up with to do it.
import numpy as np
from scipy.special import logsumexp
def log_space_product(A,B):
    Astack = np.stack([A]*A.shape[0]).transpose(2,1,0)
    Bstack = np.stack([B]*B.shape[1]).transpose(1,0,2)
    return logsumexp(Astack+Bstack, axis=0)
The inputs A and B are the logs of the matrices A0 and B0 you want to multiply, and the function returns the log of A0B0. The idea is that the i,j spot in log(A0B0) is the log of the dot product of the ith row of A0 and the jth column of B0. So it is the logsumexp of the ith row of A plus the jth column of B.
In the code, Astack is built so the i,j spot is a vector containing the ith row of A, and Bstack is built so the i,j spot is a vector containing the jth column of B. Thus Astack + Bstack is a 3D tensor whose i,j spot is the ith row of A plus the jth column of B. Taking logsumexp with axis = 0 then gives the desired result.
Erik's response doesn't seem to work for some non-square matrices (e.g. n*m times m*r). Here is a version that takes that into account:
def log_space_product(A,B):
    Astack = np.stack([A]*B.shape[1]).transpose(1,0,2)
    Bstack = np.stack([B]*A.shape[0]).transpose(0,2,1)
    return logsumexp(Astack+Bstack, axis=2)
where the i, j spot of Astack contains the i-th row of A and the i, j spot of Bstack contains the j-th column of B.
This works because the stack [A] * B.shape[1] is of shape (r, n, m), which is transposed into (n, r, m), and the stack [B] * A.shape[0] is of shape (n, m, r), which is transposed into (n, r, m). We want their first two dimensions to be (n, r) because the result matrix needs to be of shape (n, r).
Took a while to figure out myself. Hope this helps anyone implementing a HMM!
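The same idea can be written with broadcasting instead of explicit stacking. This is only a sketch, and it swaps scipy's logsumexp for NumPy's np.logaddexp.reduce so that it has no SciPy dependency; the function name log_space_matmul and the test matrices are made up for illustration.

```python
import numpy as np

def log_space_matmul(A, B):
    """Given A = log(A0) of shape (n, m) and B = log(B0) of shape (m, r),
    return log(A0 @ B0) of shape (n, r)."""
    # A[:, :, None] is (n, m, 1) and B[None, :, :] is (1, m, r); their sum is
    # (n, m, r) with entry [i, k, j] = A[i, k] + B[k, j].
    # Reducing with logaddexp over the shared axis k is a log-sum-exp.
    return np.logaddexp.reduce(A[:, :, None] + B[None, :, :], axis=1)

# Non-square example with positive entries so the logs are finite.
rng = np.random.default_rng(0)
A0 = rng.uniform(0.1, 1.0, size=(2, 3))
B0 = rng.uniform(0.1, 1.0, size=(3, 4))
log_prod = log_space_matmul(np.log(A0), np.log(B0))
print(np.allclose(np.exp(log_prod), A0 @ B0))   # True
```

Like the stacked versions, this still builds an (n, m, r) intermediate, so memory use grows with all three dimensions.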

Creating a sparse matrix from lists of sub matrices (Python)

This is my first SO question ever. Let me know if I could have asked it better :)
I am trying to find a way to splice together lists of sparse matrices into a larger block matrix.
I have python code that generates lists of square sparse matrices, matrix by matrix. In pseudocode:
Lx = [Lx1, Lx2, ... Lxn]
Ly = [Ly1, Ly2, ... Lyn]
Lz = [Lz1, Lz2, ... Lzn]
Since each individual Lx1, Lx2 etc. matrix is computed sequentially, they are appended to a list--I could not find a way to populate an array-like object "on the fly".
I am optimizing for speed, and the bottleneck features a computation of Cartesian products item-by-item, similar to the pseudocode:
M += J[i,j] * [ Lxi *Lxj + Lyi*Lyj + Lzi*Lzj ]
for all combinations of 0 <= i, j <= n. (J is an n x n matrix of numbers.)
It seems that vectorizing this by computing all the Cartesian products in one step via (pseudocode):
L = [ [Lx1, Lx2, ...Lxn],
      [Ly1, Ly2, ...Lyn],
      [Lz1, Lz2, ...Lzn] ]
product = L.T * L
would be faster. However, options such as np.bmat, np.vstack, np.hstack seem to require arrays as inputs, and I have lists instead.
Is there a way to efficiently splice the three lists of matrices together into a block? Or, is there a way to generate an array of sparse matrices one element at a time and then np.vstack them together?
Reference: Similar MATLAB code, used to compute the Hamiltonian matrix for n-spin NMR simulation, can be found here:
http://spindynamics.org/Spin-Dynamics---Part-II---Lecture-06.php
This is scipy.sparse.bmat:
import scipy.sparse
L = scipy.sparse.bmat([Lx, Ly, Lz], format='csc')
# list() is needed on Python 3, where zip returns an iterator
LT = scipy.sparse.bmat(list(zip(Lx, Ly, Lz)), format='csr') # Not equivalent to L.T
product = LT * L
I have a "vectorized" solution, but it's almost twice as slow as the original code. Both the bottleneck shown above, and the final dot product shown in the last line below, take about 95% of the calculation time according to kernprof tests.
# Create the matrix of column vectors from these lists
L_column = bmat([Lx, Ly, Lz], format='csc')
# Create the matrix of row vectors (via a transpose of matrix with
# transposed blocks)
Lx_trans = [x.T for x in Lx]
Ly_trans = [y.T for y in Ly]
Lz_trans = [z.T for z in Lz]
L_row = bmat([Lx_trans, Ly_trans, Lz_trans], format='csr').T
product = L_row * L_column
I was able to get a tenfold speed increase by not using sparse matrices and using an array of arrays.
Lx = np.empty((1, nspins), dtype='object')
Ly = np.empty((1, nspins), dtype='object')
Lz = np.empty((1, nspins), dtype='object')
These are populated with the individual Lx arrays (formerly sparse matrices) as they are generated. Using the array structure allows the transpose and Cartesian product to perform as desired:
Lcol = np.vstack((Lx, Ly, Lz)).real
Lrow = Lcol.T # As opposed to sparse version of code, this works!
Lproduct = np.dot(Lrow, Lcol)
The individual Lx[n] matrices are still "bundled", so Lproduct is an n x n matrix. This means element-wise multiplication of the n x n J array with Lproduct works:
scalars = np.multiply(J, Lproduct)
Each matrix element is then added on to the final hamiltonian matrix:
for n in range(nspins):
    for m in range(nspins):
        M += scalars[n, m].real
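For dense operators, the whole double sum can also be expressed as a single einsum call. This is a sketch under assumptions not stated in the thread: the operators are stored as one dense array L of shape (3, nspins, d, d), the products Lxi*Lxj are matrix products, and the data below is random and purely illustrative.

```python
import numpy as np

# Hypothetical sizes: 3 spins, 2 x 2 operators.
nspins, d = 3, 2
rng = np.random.default_rng(0)
L = rng.normal(size=(3, nspins, d, d))   # L[0] = Lx list, L[1] = Ly, L[2] = Lz
J = rng.normal(size=(nspins, nspins))    # coupling constants

# M = sum over i, j of J[i, j] * (Lx_i @ Lx_j + Ly_i @ Ly_j + Lz_i @ Lz_j)
# a: x/y/z component, i and j: spin indices, p, q, r: matrix indices.
M = np.einsum('ij,aipq,ajqr->pr', J, L, L)

# Reference double loop, for comparison.
M_loop = np.zeros((d, d))
for i in range(nspins):
    for j in range(nspins):
        M_loop += J[i, j] * (L[0, i] @ L[0, j] + L[1, i] @ L[1, j] + L[2, i] @ L[2, j])

print(np.allclose(M, M_loop))   # True
```

Whether this beats the object-array version for realistic sizes has not been measured here; passing optimize=True to np.einsum may help for larger inputs.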
