I think I asked the wrong question yesterday. What I actually want is to mutiply two 2x2xN matrices A and B, so that
C[:,:,i] = dot(A[:,:,i], B[:,:,i])
For example, if I have a matrix
A = np.arange(12).reshape(2, 2, 3)
How can I get C = A x A with the definition described above? Is there a built-in function to do this?
Also, if I multiply A (shape 2x2xN) with B (shape 2x2x1, instead of N), I want to get
C[:,:,i] = dot(A[:,:,i], B[:,:,1])
Try using numpy.einsum, it has a little bit of a learning curve but it should give you what you want. Here is an example to get you started.
import numpy as np
A = np.random.random((2, 2, 3))
B = np.random.random((2, 2, 3))
C1 = np.empty((2, 2, 3))
for i in range(3):
C1[:, :, i] = np.dot(A[:, :, i], B[:, :, i])
C2 = np.einsum('ijn,jkn->ikn', A, B)
np.allclose(C1, C2)
Related
I am attempting a numpy.matmul call using as variables
Matrix A of dimensions (p, t, q)
Matrix B of dimensions (r, t).
A categories vector of shape r and p categories, used to take slices of B and define the index of A do use.
The multiplications are done iteratively using the indices of each category. For each category p_i, I extract from A a submatrix (t, q). Then, I multiply those with a subset of B (x, t), where x is a mask defined by r == p_i. Finally, the matrix multiplication of (x, t) and (t, q) produces the output (x, q) which is stored at S[x].
I have noted that I cannot figure out a non-iterative version of this algorithm. The first snippet describes an iterative solution. The second one is an attempt at what I would wish to get, where everything is calculated as a single-step and would be presumably faster. However, it is incorrect because matrix A has three dimensions instead of two. Maybe there is no way to do this in NumPy with a single call, and in general, looking for advice/ideas to try out.
Thanks!
import numpy as np
p, q, r, t = 2, 9, 512, 4
# data initialization (random)
np.random.seed(500)
S = np.random.rand(r, q)
A = np.random.randint(0, 3, size=(p, t, q))
B = np.random.rand(r, t)
categories = np.random.randint(0, p, r)
print('iterative') # iterative
for i in range(p):
# print(i)
a = A[i, :, :]
mask = categories == i
b = B[mask]
print(b.shape, a.shape, S[mask].shape,
np.matmul(b, a).shape)
S[mask] = np.matmul(b, a)
print(S.shape)
a simple way to write it down
S = np.random.rand(r, q)
print(A[:p,:,:].shape)
result = np.matmul(B, A[:p,:,:])
# iterative assignment
i = 0
S[categories == i] = result[i, categories == i, :]
i = 1
S[categories == i] = result[i, categories == i, :]
The next snippet will produce an error during the multiplication step.
# attempt to multiply once, indexing all categories only once (not possible)
np.random.seed(500)
S = np.random.rand(r, q)
# attempt to use the categories vector
a = A[categories, :, :]
b = B[categories]
# due to the shapes of the arrays, this multiplication is not possible
print('\nsingle step (error due to shapes of the matrix a')
print(b.shape, a.shape, S[categories].shape)
S[categories] = np.matmul(b, a)
print(scores.shape)
iterative
(250, 4) (4, 9) (250, 9) (250, 9)
(262, 4) (4, 9) (262, 9) (262, 9)
(512, 9)
single step (error due to shapes of the 2nd matrix a).
(512, 4) (512, 4, 9) (512, 9)
In [63]: (np.ones((512,4))#np.ones((512,4,9))).shape
Out[63]: (512, 512, 9)
This because the first array is broadcasted to (1,512,4). I think you want instead to do:
In [64]: (np.ones((512,1,4))#np.ones((512,4,9))).shape
Out[64]: (512, 1, 9)
Then remove the middle dimension to get a (512,9).
Another way:
In [72]: np.einsum('ij,ijk->ik', np.ones((512,4)), np.ones((512,4,9))).shape
Out[72]: (512, 9)
To remove the loop altogether, you can try this
bigmask = np.arange(p)[:, np.newaxis] == categories
C = np.matmul(B, A)
res = C[np.broadcast_to(bigmask[..., np.newaxis], C.shape)].reshape(r, q)
# `res` has the same rows as the iterative `S` but in the wrong order
# so we need to reorder the rows
sort_index = np.argsort(np.broadcast_to(np.arange(r), bigmask.shape)[bigmask])
assert np.allclose(S, res[sort_index])
Though I'm not sure it's much faster than the iterative version.
import numpy as np
A = np.empty((0, 3))
temp = np.array([1, 1, 1])
A = np.vstack([A, temp])
A = np.vstack([A, temp])
B = [A]
temp = np.array([2, 2, 0])
A = np.vstack([A, temp])
A = np.vstack([A, temp])
A = np.vstack([A, temp])
A = np.vstack([A, temp])
B = B.append(A)
So it does not work. How do I make a list of numpy arrays? The problem is that I have N types of points. Every type of points has M number of points. Every point is 3 coordinate array. Because I dont know in the first place the values of N and M, I need to do all the things dynamicly. When I had N = 1, vstack worked perfectly, but now every type has its own M, and array is not uniform anymore. So my guess - I need to work in numpy/vstack just as if I had N = 1, but afterwards just contain this np.empty((0, 3)) arrays somewhere. Is it possible? Maybe some empty object-type dictionary?
Thank you very mush in advance!
I have N, 2x4 arrays stored in a (2x4xN) array J. I am trying to calculate the pseudoinverse for each of the N, 2x4 arrays, and save the pseudoinverses to a (N x 4 x 2) array J_pinv.
What I'm currently doing:
J_pinvs = np.zeros((N, 4, 2))
for i in range(N):
J_pinvs[i, :, :] = np.transpose(J[:, :, i]) # np.linalg.inv(J[:, :, i] # J[:, :, i].transpose())
This works but I would like to speed up the compute time as this will be running in a layer of a neural network so I would like to make it as fast as possible.
What I've tried:
J_pinvs = np.zeros((N, 4, 2))
J_pinvs2[:, :, :] = np.transpose(J[:, :, :]) # np.linalg.inv(J[:, :, :] # J[:, :, :].transpose())
Generates the error:
<ipython-input-87-d8ee1ba2ae5e> in <module>
1 J_pinvs2 = np.zeros((4, 2, 3))
----> 2 J_pinvs2[:, :, :] = np.transpose(J[:, :, :]) # np.linalg.inv(J[:, :, :] # J[:, :, :].transpose())
ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 4 is different from 3)
Is there a way to do this with slicing so that I don't need to use an iterator? I'm having trouble finding anything online. Any help/suggestions would be appretiated!
Thanks,
JM
I think you need to specify how to transpose a 3-D array:
np.linalg.inv(a # a.transpose(0,2,1))
will work. As oppose to
# sample data
a = np.arange(24).reshape(-1,2,4)
a.shape
# (3, 2, 4)
a.transpose().shape
# (4, 2, 3)
and
a # a.transpose()
will not work.
Finally, the whole script should be:
a.transpose(0,2,1) # np.linalg.inv(a # a.transpose(0,2,1))
Consider the following arrays:
a = np.array([0,1])[:,None]
b = np.array([1,2,3])
print(a)
array([[0],
[1]])
print(b)
b = np.array([1,2,3])
Is there a simple way to concatenate these two arrays in a way that the latter is broadcast, in order to obtain the following?
array([[0, 1, 2, 3],
[1, 1, 2, 3]])
I've seen there is this closed issue with a related question. An alternative is proposed involving np.broadcast_arrays, however I cannot manage to adapt it to my example. Is there some way to do this, excluding the np.tile/np.concatenate solution?
You can do it in the following way
import numpy as np
a = np.array([0,1])[:,None]
b = np.array([1,2,3])
b_new = np.broadcast_to(b,(a.shape[0],b.shape[0]))
c = np.concatenate((a,b_new),axis=1)
print(c)
Here a more general solution:
def concatenate_broadcast(arrays, axis=-1):
def broadcast(x, shape):
shape = [*shape] # weak copy
shape[axis] = x.shape[axis]
return np.broadcast_to(x, shape)
shapes = [list(a.shape) for a in arrays]
for s in shapes:
s[axis] = 1
broadcast_shape = np.broadcast(*[
np.broadcast_to(0, s)
for s in shapes
]).shape
arrays = [broadcast(a, broadcast_shape) for a in arrays]
return np.concatenate(arrays, axis=axis)
I had a similar problem where I had two matrices x and y of size (X, c1) and (Y, c2) and wanted the result to be the matrix of size (X * Y, c1 + c2) where the rows of the result were all the concatenations of rows from x and rows from y.
I, like the original poster, was disappointed to discover that concatenate() would not do broadcasting for me. I thought of using the solution above, except X and Y could potentially be large, and that solution would use a large temporary array.
I finally came up with the following:
result = np.empty((x.shape[0], y.shape[0], x.shape[1] + y.shape[1]), dtype=x.dtype)
result[...,:x.shape[0]] = x[:,None,:]
result[...,x.shape[0]:] = y[None,:,:]
result = result.reshape((-1, x.shape[1] + y.shape[1]))
I create a result array of size (X, Y, c1 + c2), I broadcast in the contents of x and y, and then reshape the results to the right size.
Given the product of a matrix and a vector
A.v
with A of shape (m,n) and v of dim n, where m and n are symbols, I need to calculate the Derivative with respect to the matrix elements.
I haven't found the way to use a proper vector, so I started with 2 MatrixSymbol:
n, m = symbols('n m')
j = tensor.Idx('j')
i = tensor.Idx('i')
l = tensor.Idx('l')
h = tensor.Idx('h')
A = MatrixSymbol('A', n,m)
B = MatrixSymbol('B', m,1)
C=A*B
Now, if I try to derive with respect to one of A's elements with the indices I get back the unevaluated expression:
diff(C, A[i,j])
>>>> Derivative(A*B, A[i, j])
If I introduce the indices in C also (it won't let me use only one index in the resulting vector) I get back the product expressed as a Sum:
C[l,h]
>>>> Sum(A[l, _k]*B[_k, h], (_k, 0, m - 1))
If I derive this with respect to the matrix element I end up getting 0 instead of an expression with the KroneckerDelta, which is the result that I would like to get:
diff(C[l,h], A[i,j])
>>>> 0
I wonder if maybe I shouldn't be using MatrixSymbols to start with. How should I go about implementing the behaviour that I want to get?
SymPy does not yet know matrix calculus; in particular, one cannot differentiate MatrixSymbol objects. You can do this sort of computation with Matrix objects filled with arrays of symbols; the drawback is that the matrix sizes must be explicit for this to work.
Example:
from sympy import *
A = Matrix(symarray('A', (4, 5)))
B = Matrix(symarray('B', (5, 3)))
C = A*B
print(C.diff(A[1, 2]))
outputs:
Matrix([[0, 0, 0], [B_2_0, B_2_1, B_2_2], [0, 0, 0], [0, 0, 0]])
The git version of SymPy (and the next version) handles this better:
In [55]: print(diff(C[l,h], A[i,j]))
Sum(KroneckerDelta(_k, j)*KroneckerDelta(i, l)*B[_k, h], (_k, 0, m - 1))